Encapsulation: Looking Beyond the Surface 🔍

Encapsulation doesn't eliminate complexity; it merely organizes it into manageable pieces.

Sep 12, 2023

One of the fundamental pillars of OOP is called encapsulation. However, some developers still don't grasp the purpose and true power of encapsulation. I hope this post can help shed some light on this crucial pillar! If you enjoy the content, please share and leave a like on the post!😄

The Origin of the Word ⛩️

The term "encapsulation" is derived from the verb "encapsulate," which means to enclose or contain something within a capsule. The word has its roots in the Latin “capsula,” a diminutive form of “capsa,” meaning "box" or "case." Thus, at its core, encapsulation is about protection, the idea of containing something to preserve its integrity.

Demystifying Encapsulation 🤔

You've surely heard, "Encapsulation is about hiding information inside a class." Many developers, especially beginners, think that by simply making the members of a class private and concealing them from client classes, they are applying encapsulation. However, it's crucial to understand that encapsulation goes beyond just hiding information.

Another perspective defines encapsulation as the combination of data and behaviors. Although not incorrect, this definition doesn't fully capture the essence of the concept. It's an approach to achieve good encapsulation, but it doesn't entirely define the pillar.

Information Hiding and the Combination of Data and Behaviors are tools that are part of true encapsulation, but when applied in isolation, they don't completely define it.

Why is it so important to talk about encapsulation? 🧐

It all comes down to complexity. The more code a piece of software has, the more complex it becomes. And code complexity is one of the biggest challenges we face in enterprise software development. The more complex the codebase becomes, the more difficult it is to work with. The results? The speed of developing new features decreases. The number of inconsistencies and bugs increases greatly, even causing financial problems for companies. And last but not least, the ability to respond to specific business needs ends up being compromised!

Without encapsulation we have no practical and effective way of dealing with code complexity. It's true that we can never stop or reverse the entropy of a codebase. But we can keep it under control and not be controlled by it! Furthermore, without encapsulation, software components become disorganized and classes lack cohesion. This places a greater mental load on the programmer, who has to respond quickly to changes and understand complex flows without altering the behavior that the system already has.

Encapsulation does not eliminate complexity; it just organizes it into manageable chunks.

Encapsulation provides us with a healthy alternative for the code, for companies' pockets, and for our own minds. How? By maintaining the organization and cohesion of the components. But as we have already said, we really need to understand the essence of this very important pillar! Let's now talk about the heart of this principle.

The Heart of Encapsulation ❤️

So how can we define it? See the most suitable definition below:

Encapsulation is the act of protecting the integrity of data.

This is typically achieved by preventing the clients of a class from setting or altering its internal data to an invalid or inconsistent state. Let's dive deeper into this. 😄

When we talk about "protecting the integrity of data", we're referring to the idea that an object's data should be safe from unwanted or unintentional modifications. Imagine the data as a treasure inside a safe. Encapsulation is that safe, allowing only the right people, with the correct key (or correct methods), to access or modify that treasure.

Now, to elaborate further, when we mention "clients of a class", we're talking about other parts of the code (components, artifacts, or classes) or programs that want to use or interact with that class. The idea is that not all parts of the code should be able to directly modify the class's internal data. Instead, they should use specific methods the class provides. For instance, rather than letting anyone set the value of a variable directly, you'd have a method to set that value. This method could then check if the new value is valid before accepting it. Thus, encapsulation is essentially a practice ensuring that a class's data is accessed and modified in a controlled and secure manner. This helps prevent errors and maintains data integrity, ensuring it remains consistent and in a valid state.

And where do the two tools mentioned previously fit in?

An illustration helps make this clear. Information hiding can be envisioned as the walls and doors of a house. Upon entering a house, we only see the rooms and furnishings available to us. We don't see the internal pipework, electrical wiring, or the concrete and steel structure supporting the building. And, in fact, we don't need to see or interact with these elements to inhabit the house. This is the essence of concealment. It's not just about hiding but providing a focused experience, showcasing only what's necessary to perform a specific task. In terms of programming, this prevents the intricate and specific details of a class from being unnecessarily exposed, reducing the likelihood of errors and making the software easier to use and understand.

On the other hand, the Binding of Data and Behaviors can be likened to the workings of a mechanical clock. Each gear, on its own, serves a purpose, but it's their combined operation, the interaction among these parts, that allows the clock to accurately keep time. In OOP, this binding ensures that the data and operations on them are interconnected, allowing the class to operate as a cohesive unit. This guarantees that data is always handled appropriately since the logic and information are intrinsically linked. This approach strengthens encapsulation, ensuring that data is safeguarded and accessed only by appropriate methods, which in turn maintains the system's integrity and consistency.

Through encapsulation, we ensure that data remains consistent and uncorrupted.

These two tools are essential to achieve the primary goal of safeguarding data integrity! And this brings us to another cool topic, invariants! Don't worry, there's nothing complicated about it; let's delve deeper!

Encapsulation and Invariants: Ensuring the Stability of Your Software 🫱🏼‍🫲🏼

To delve into the relationship between encapsulation and invariants, we first need to understand what "invariant" means within the programming context. Essentially, an invariant is a condition or property that always remains true in a system, regardless of the operations or transformations the system undergoes. It's an aspect of the system that doesn't change, hence the name. Now, why are invariants crucial in programming, especially in object-oriented programming?

Think of an object as a "contract." When an object is created, it promises to behave in a certain manner and follow specific rules. These rules are, in many cases, invariants. They ensure that, regardless of the interactions with the object, certain conditions will always be upheld. This is where encapsulation comes into play.

Encapsulation is a potent tool to ensure an object's invariants are maintained. By limiting direct access to an object's data and providing only specific interfaces (methods) to interact with this data, we're essentially safeguarding that object's "contract." We ensure that operations that might breach the invariants are not performed inadvertently.

For instance, consider a class representing a date. An invariant of this class could be that the day of the month must always be between 1 and 31. If we allow direct access to the variable storing the day of the month, we risk external code setting this variable to, say, 50, thereby breaking the invariant. However, by encapsulating this variable and providing a method to set the day, we can check if the provided value is within the acceptable range, thereby keeping our invariant safe.
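As a minimal sketch of the date example above: the class name `SimpleDate` and its members are hypothetical, and the check deliberately uses only the 1 to 31 range mentioned in the text, ignoring month lengths.

```csharp
using System;

// Hypothetical class for illustration; a real date type would also
// validate the day against the month and year.
public class SimpleDate
{
    private int _day;

    public SimpleDate(int day)
    {
        SetDay(day);
    }

    public int GetDay()
    {
        return _day;
    }

    public void SetDay(int value)
    {
        // Guard the invariant: the day must stay between 1 and 31.
        if (value < 1 || value > 31)
            throw new ArgumentOutOfRangeException(nameof(value), "Day must be between 1 and 31.");

        _day = value;
    }
}
```

With this in place, a call such as `SetDay(50)` throws instead of silently storing an impossible day, and the previously stored value stays untouched.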

In practical terms, thinking about encapsulation in terms of invariants is like designing a fortress around your software's essential rules. It's not merely about hiding details but safeguarding the system's behavior foundations.

To clearly understand how encapsulation works hand in hand with invariants, let's consider a simple yet illustrative object: the square. This geometric object has well-defined characteristics that we can model in software.

Invariants:

All sides of the square have the same length.
The angles between the sides are always 90 degrees.
The area of the square is always the side length squared.

public class Square
{
    private double _side;

    public Square(double side)
    {
        SetSide(side);
    }

    public void SetSide(double value)
    {
        if (value <= 0)
            throw new ArgumentException("Side value must be positive!");

        _side = value;
    }

    public double Area()
    {
        return _side * _side;
    }
}

Now, let's see how encapsulation helps maintain our invariants:

Sides of Equal Length: We use a private field _side to store the length of the square's side. This field is modified only through the SetSide method, which ensures the entered value is valid. This way, all the sides of the square always have the same length.
90 Degree Angles: This invariant is intrinsic to the definition of a square. We ensure this by not allowing different sides to be defined separately.
Area: The Area method returns the calculation of a square's area based on the standard mathematical formula. Since we use the encapsulated value of _side and ensure its integrity through the SetSide method, the returned area will always be correct.

In this example, encapsulation serves as a tool to maintain the invariants of the square. By limiting the way the object's internal data is accessed and modified, we ensure our Square object always behaves like a real square in the mathematical world. This simplifies the use of the class and ensures reliability and accuracy in the operations performed on it.

We can further enhance this object and include a method to retrieve the side value (a "getter") if it's needed for some external use, for example:

public double GetSide()
{
   return _side;
}

This would give the class the flexibility to allow the side value to be read, but still protect the integrity of the data by not allowing it to be directly altered without validation.
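In C#, a getter and setter pair like this is often written as a property. A possible sketch follows; `SquareWithProperty` is a hypothetical name used only to avoid clashing with the `Square` class above.

```csharp
using System;

// Hypothetical variant of the Square class, shown only to illustrate
// a property whose setter validates the value.
public class SquareWithProperty
{
    private double _side;

    public SquareWithProperty(double side)
    {
        Side = side; // goes through the validating setter below
    }

    public double Side
    {
        get { return _side; }
        set
        {
            if (value <= 0)
                throw new ArgumentException("Side value must be positive!");

            _side = value;
        }
    }

    public double Area()
    {
        return _side * _side;
    }
}
```

Reading `Side` is unrestricted, while every write still passes through the same validation, so the invariant is protected in exactly one place.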


Benefit of Invariants and Modularity in the Developer's Daily Life 💻

In the vast world of programming, every detail matters. It's in this intricate realm that invariants shine like a beacon, ensuring that we, developers, have the confidence that our objects will behave as expected. But how, exactly, does this characteristic reflect in practice? How does it shape the way we program and structure our systems?

The Problem: Increasing Complexity

As software grows and becomes more complex, it becomes increasingly difficult to keep track of every detail. In this scenario, classes and objects become critical points, as each of them can contain multiple operations and behaviors.

Imagine, for instance, an e-commerce system. Within it, we'll have several classes: Product, ShoppingCart, User, Order, among others. If you're developing a feature that involves the ShoppingCart, do you really want to understand all the details of how the Product or User is implemented? Probably not. You want to trust that they will work as expected.

The Value of Invariants

This is where invariants come into play. They are like silent contracts that each object makes, ensuring that, regardless of what happens, certain conditions will always hold true.

Let's think about the Order object in our e-commerce system. One of the invariants might be that an order always needs to have at least one product. Another invariant might be that an order, once paid, cannot have its total value changed.

Practical Example in C#

Imagine that we have the following class:

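using System;
using System.Collections.Generic;
using System.Linq;

public class Order
{
    private List<Product> _products;
    private bool _hasBeenPaid = false;

    public Order(List<Product> products)
    {
        if (products == null || !products.Any())
            throw new ArgumentException("Order must have at least one product.");

        // Copy the list so callers cannot modify the order's products from outside.
        _products = new List<Product>(products);
    }

    public decimal CalculateTotal()
    {
        return _products.Sum(p => p.Price);
    }

    // Marks the order as paid; from here on its products can no longer change.
    public void FinalizeOrder()
    {
        _hasBeenPaid = true;
    }

    public void AddProduct(Product product)
    {
        if (_hasBeenPaid)
            throw new InvalidOperationException("It's not possible to add products to an already paid order.");

        _products.Add(product ?? throw new ArgumentNullException(nameof(product)));
    }
}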

In this class, we have two clear invariants:

An order will always have at least one product.
Products cannot be added after the order has been finalized.

And this brings us benefits:

Private Fields: The fields _products and _hasBeenPaid are private. This ensures that direct access to these fields is restricted to the class, protecting their integrity.
Validations: The class constructor checks if the product list is empty before accepting the order. This ensures that an order always has at least one product.
Methods that Control State: The AddProduct method checks if the order has already been paid before allowing the addition of new products. This is a good example of encapsulation, where the object controls its own state and ensures that its internal rules are followed.

For a developer working on another module of the software, this predictability is a relief. They can use the Order class with confidence, without worrying that their code might accidentally create an order without products or add products to an already finalized order.

The Connection to Modularity

Invariants support modularity. They allow us to treat each class or object as an independent unit - a black box. This is crucial for large-scale software development.

For instance, if a team member is working on a feature that involves payment processing, they don't need to delve into all the details of the Order class. They only need to be familiar with its invariants and the available public methods.

Impact on the Daily Routine

The value of this in everyday life is immense. When you pick up a piece of code written by someone months (or even years) ago, it's comforting to know there are clear rules that this code adheres to, and that the logic is safeguarded by encapsulation. You don't have to sift through every line to grasp all the nuances. The invariants give you a clear view of what to expect.

Moreover, when it comes to testing your code, invariants provide a clear guide. You know precisely which conditions must always hold true, and you can write tests to ensure they remain so.
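To make that concrete, here is a rough sketch of invariant checks written as plain methods rather than with a specific test framework. The minimal `Product`, the trimmed-down `Order`, and the `OrderInvariantChecks` helper are assumptions made for illustration; in a real project these checks would typically be test methods in a framework such as xUnit, NUnit, or MSTest, run against the actual `Order` class.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical minimal Product, assumed only so the sketch compiles.
public class Product
{
    public decimal Price { get; }

    public Product(decimal price) { Price = price; }
}

// Trimmed-down stand-in for the Order class discussed above.
public class Order
{
    private readonly List<Product> _products;
    private bool _hasBeenPaid;

    public Order(List<Product> products)
    {
        if (products == null || !products.Any())
            throw new ArgumentException("Order must have at least one product.");

        _products = new List<Product>(products);
    }

    public void FinalizeOrder() { _hasBeenPaid = true; }

    public void AddProduct(Product product)
    {
        if (_hasBeenPaid)
            throw new InvalidOperationException("It's not possible to add products to an already paid order.");

        _products.Add(product);
    }
}

public static class OrderInvariantChecks
{
    // Invariant 1: an order cannot be created without products.
    public static bool EmptyOrderIsRejected()
    {
        try { new Order(new List<Product>()); }
        catch (ArgumentException) { return true; }
        return false;
    }

    // Invariant 2: a paid order does not accept new products.
    public static bool PaidOrderRejectsNewProducts()
    {
        var order = new Order(new List<Product> { new Product(10m) });
        order.FinalizeOrder();
        try { order.AddProduct(new Product(5m)); }
        catch (InvalidOperationException) { return true; }
        return false;
    }
}
```

Each check maps to exactly one invariant, so a failing check tells you which rule was broken.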

The Underestimated Danger of Poorly Defined Invariants ⚠️

But what happens when these invariants aren't clearly defined or specified? What are the real and profound consequences of this oversight?

The Perilous Misalignment

Alignment between the technical team and the business team is more than just best practice - it's the backbone of effective software development. When there's misalignment, we're not just talking minor bugs or glitches but systemic failures that can harm data integrity, user experience, and ultimately the commercial viability of the software.

But how does this occur? And why is it so prevalent?

Business teams often focus on the product's overarching objectives, while developers home in on granular technical details. The gap between these two perspectives is where poorly defined invariants take root.

The Profound Impact of Inadequate Invariants

An ill-defined invariant is not just a technical oversight - it's a failure in communication. And this misstep can have numerous ramifications:

System Evolution Limitation: A system that can't easily adapt to new requirements due to rigid or ill-defined invariants becomes a liability.
Introduction of Complex Bugs: A vague invariant can let errors slip through during testing, emerging in production environments.
Increased Maintenance Cost: Fixing an ill-defined invariant post-implementation is far more costly than getting it right from the start.
Erosion of Trust: Failures caused by improper invariants can erode user and stakeholder confidence in the software.

Invariant Evolution and Adaptation

In any computing field, be it software development, systems engineering, or even research, one principle remains steadfast: evolution. The business world is dynamic, and with it, the needs and demands shaping software shift. Invariants, which act as stability and predictability pillars in our systems, are not immune to this dynamic. Effectively and safely adapting and evolving invariants is vital to preserving a system's integrity and relevance.

Picture a scenario where we're developing software for a bookstore. A set invariant might state, "A book must always have an associated author." This rule seems sensible and serves the system's initial purpose. However, as the business expands and decides to sell magazines and periodicals, which often don't have a single associated author, the once sensible invariant now poses a constraint.

The Impact of Changing Invariants

Altering an invariant isn't trivial. Changing one directly impacts system behavior and expectations. It's not an action to be rushed, as the implications can be vast.

Taking our bookstore example, were we to merely remove the invariant demanding an author for every book, we'd introduce a wide array of potential problems. We might end up with author-less books, confusing customers and muddling the cataloging process. Moreover, functions that once assumed an author's existence for every book could now break or yield unexpected results.

The Safety Net of Unit Testing

This is where the critical role of unit testing comes into play. As invariants change, unit tests act as the first line of defense, pinpointing the areas of the code that now breach the new invariants.

Suppose we have tests checking for an author linked to each book. When attempting to introduce author-less magazines, these tests would fail, indicating exactly where the code needs tweaking to fit the new invariant.

The Cohesion between Development and Business

A common pitfall when evolving invariants is to overlook the business perspective. While developers are focused on ensuring technical integrity, the business team is more concerned with usability, relevance, and value delivered by the software.

It's crucial that there's clear and regular communication between the development and business teams. A proposed change in the invariant "A book must always have an associated author" could be discussed with the business team. Perhaps the solution isn't to remove the invariant, but to adapt it to "A sale item must have at least one associated credit entity", where "credit entity" could be an author, publisher, or organization.
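One way the adapted invariant could look in code, purely as a sketch: `SaleItem` and `CreditEntity` are hypothetical names taken from the wording above, not classes from an existing system.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical type: an author, publisher, or organization credited for an item.
public class CreditEntity
{
    public string Name { get; }

    public CreditEntity(string name)
    {
        Name = name ?? throw new ArgumentNullException(nameof(name));
    }
}

// Hypothetical type covering books, magazines, and periodicals.
public class SaleItem
{
    private readonly List<CreditEntity> _credits;

    public string Title { get; }

    public IReadOnlyList<CreditEntity> Credits => _credits.AsReadOnly();

    public SaleItem(string title, IEnumerable<CreditEntity> credits)
    {
        Title = title ?? throw new ArgumentNullException(nameof(title));
        if (credits == null)
            throw new ArgumentNullException(nameof(credits));

        // Copy the input so the caller cannot change the list afterwards.
        _credits = credits.ToList();

        // Adapted invariant: at least one credit entity, none of them null.
        if (_credits.Count == 0 || _credits.Any(c => c == null))
            throw new ArgumentException("A sale item must have at least one associated credit entity.", nameof(credits));
    }
}
```

A magazine credited to a publisher and a book credited to an author both satisfy the same rule, so the business change does not require dropping the invariant.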

But how does encapsulation fit into this? Let's dive in.

A Guardian for the Invariants

In the bookstore scenario we addressed earlier, the invariant that "A book must always have an associated author" is crucial for the system's proper functioning. If this rule is inadvertently broken, the system might face errors and inconsistencies. Here, encapsulation proves valuable by ensuring that the internal state of the object - in this case, the association between a book and its author - can only be changed through well-defined operations that respect the established invariants.

Example to visualize in code what we just discussed 👇🏻:

public class Author
{
    public string Name { get; private set; }

    public Author(string name)
    {
        Name = name ?? throw new ArgumentNullException(nameof(name));
    }
}

public class Book
{
    private Author _author;

    public string Title { get; private set; }

    public Author Author 
    { 
        get 
        {
            return _author; 
        }
    }

    public Book(string title, Author author)
    {
        Title = title ?? throw new ArgumentNullException(nameof(title));
        _author = author ?? throw new ArgumentNullException(nameof(author));
    }
}

Flexibility in Adaptation

Encapsulation also offers a flexible way to adapt the system to changes. If, for instance, the bookstore decides to sell books that don't necessarily have a single author, the class could be easily adapted to support this new demand without exposing the details of this change to the rest of the system. We can see this below:

public class Book
{
    private List<Author> _authors;

    public string Title { get; private set; }

    // To maintain compatibility with existing code:
    public Author Author 
    { 
        get 
        {
            return _authors.FirstOrDefault(); 
        }
    }

    public IReadOnlyList<Author> Authors => _authors.AsReadOnly();

    public Book(string title, Author author)
    {
        Title = title ?? throw new ArgumentNullException(nameof(title));
        _authors = new List<Author> { author ?? throw new ArgumentNullException(nameof(author)) };
    }

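    public Book(string title, List<Author> authors)
    {
        Title = title ?? throw new ArgumentNullException(nameof(title));
        if (authors == null)
            throw new ArgumentNullException(nameof(authors));
        if (authors.Count == 0 || authors.Contains(null))
            throw new ArgumentException("A book must have at least one non-null author.", nameof(authors));

        // Copy the list so later changes to the caller's list cannot alter this book.
        _authors = new List<Author>(authors);
    }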

    public void AddAuthor(Author author)
    {
        _authors.Add(author ?? throw new ArgumentNullException(nameof(author)));
    }
}

The "Book" class presented is an elegant illustration of encapsulation in action, balancing the need for adaptability with robustness. Let's delve into its key features:

Private List of Authors: The _authors list is private, meaning that direct access to it is restricted and control over its modifications is exclusive to the "Book" class. This safeguards the integrity of the authors' list.
Restricted Access Properties: The "Title" property has a private setter and is assigned only in the constructor, ensuring that once assigned, a book's title cannot be altered from outside the class.
Compatibility with Existing Code: The "Author" property was retained even after migrating to support multiple authors. It returns the first author in the list, preserving the functionality for code that expected this property.
Read-Only List for Consumers: The "Authors" property returns a read-only version of the internal list. This prevents external consumers from modifying the list directly, but they can still view its contents. This technique provides data access without sacrificing integrity.
Defensive Constructors: The constructors of the "Book" class ensure that neither the title nor the authors can be null. If either of them is, an exception will be thrown. This protects the "Book" object from entering an invalid state.
Author Addition Method: Instead of allowing direct modification of the author's list, an "AddAuthor" method is provided. This gives the "Book" class full control over how and when an author can be added, ensuring that any additional rules (if existing in the future) can be easily implemented.

Beyond these features, I'd like to highlight three everyday advantages for us programmers:

Backward Compatibility: By retaining the "Author" property, which returns the first author from the list, the class remains compatible with existing code. This is valuable because it means that any other class or function that was using the "Book" class previously doesn't need modifications. In other words, if other code was expecting to access a book's author through the "Author" property, it can still do so without knowing that internally, the "Book" class now supports multiple authors.
Flexibility for Future Expansion: The way the class is structured makes it highly adaptable for future changes. For example, if we decide to implement more rules on how authors are added, we can do so easily without disrupting the existing functionality.
Maintenance Simplification: When changes are needed, they can be made in the "Book" class without affecting parts of the system that rely on it. This isolates the impact of changes and simplifies software maintenance.

And all of this significantly eases maintainability!

The Pillar of Maintainability

Encapsulation not only safeguards invariants but also fosters more effective maintainability. Since the inner details are well-guarded, development teams can focus on optimizing, refactoring, or extending the class without the constant fear of breaking dependent functionalities.

For instance, if in the future, the bookstore decides to categorize authors by literary genre, this change could be internally incorporated without affecting the rest of the system. Encapsulation would ensure that, even with this added complexity, the original promise of the class - that every book has at least one credited entity - would remain unchanged.

Public Interface

Now it's time to discuss public interfaces of a class. In object-oriented programming, a public interface refers to the set of methods, properties, events, and other members that are intentionally exposed for access or use by other classes or components. In other words, it's the "contract" that a class offers to the outside world, specifying how external consumers can interact with it.

The public interface is the visible façade of your class; it's what developers see and engage with when using the class in other parts of the software. This interface allows developers to access and utilize the functionality of the class without necessarily having to understand the details of its internal implementation. This is crucial to the concept of encapsulation.

Through the public interface, developers can:

Interact with the class: Create instances, call methods, access properties, etc.
Extend or implement the class: In languages that support inheritance or interfaces.
Test the class: The public interface provides a means to test the functionality of the class, validating if it's fulfilling its contract.

However, it's important to note that not all members of a class are part of its public interface. Those not intended for external use are often marked as private or protected. We will understand this better with an example.

Below, see the class StringValidator:

public class StringValidator
{
    private string _value;

    public StringValidator(string value)
    {
        _value = value;
    }

    public bool IsValid()
    {
        return IsNotEmptyOrNull() 
            && HasAtLeastEightCharacters() 
            && ContainsUppercaseLetter() 
            && DoesNotContainSpecialCharacters();
    }

    private bool IsNotEmptyOrNull()
    {
        return !string.IsNullOrEmpty(_value);
    }

    private bool HasAtLeastEightCharacters()
    {
        return _value.Length >= 8;
    }

    private bool ContainsUppercaseLetter()
    {
        return _value.Any(char.IsUpper);
    }

    private bool DoesNotContainSpecialCharacters()
    {
        return _value.All(ch => char.IsLetterOrDigit(ch) || char.IsWhiteSpace(ch));
    }
}

The rules for validating the string are clearly encapsulated and not exposed. Only the IsValid() method is public. This means that those using the class do not need to concern themselves with the details of these rules or run the risk of violating any invariant.

Moreover, by encapsulating each rule within its own private method, it becomes easier to read and modify the class in the future, while the rules remain testable through IsValid().

We can further enhance the class by using the readonly keyword in C#. But what's the purpose? It can be used to declare that a field (or instance variable) of a class or struct can only be assigned during its declaration or within the constructor of the type it belongs to. In other words, once construction is finished, a readonly field cannot be assigned a new value.

The purpose of readonly is to provide an assurance that a field will not be reassigned after initialization, which helps in creating immutable objects and in protecting certain internal aspects of a type from undesired modifications. Keep in mind that readonly only prevents reassigning the field: if the field refers to a mutable object such as a List, that object's contents can still change.

Benefits of using readonly include:

Guaranteed Immutability: Helps ensure that once an object is created in a particular state, it remains in that state.
Clear Intent: Clearly indicates to other developers that the intent is for the value not to change after initialization.
Protection Against Mistakes: Prevents accidental errors where a field might be undesirably reassigned elsewhere in the code.
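Applied to the validator above, a sketch might look like this. `ReadOnlyStringValidator` and `TagList` are hypothetical names; the second class is there only to show that `readonly` blocks reassignment of the field, not changes to the object the field refers to.

```csharp
using System;
using System.Collections.Generic;

public class ReadOnlyStringValidator
{
    // Can be assigned here or in a constructor, and nowhere else.
    private readonly string _value;

    public ReadOnlyStringValidator(string value)
    {
        _value = value;
    }

    public bool IsNotEmptyOrNull()
    {
        // _value = "other"; // would not compile: a readonly field cannot be assigned here
        return !string.IsNullOrEmpty(_value);
    }
}

public class TagList
{
    // The reference is fixed, but the list it points to is still mutable.
    private readonly List<string> _tags = new List<string>();

    public int Count => _tags.Count;

    public void Add(string tag)
    {
        _tags.Add(tag ?? throw new ArgumentNullException(nameof(tag)));
    }
}
```

The second class is why `readonly` on its own is not a guarantee of deep immutability: the list can still grow, but only through the `Add` method the class chooses to expose.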

These details make a difference in daily work, but be careful! ⚠️

🚨 Before using any keyword from a programming language, take the time to study it; it's crucial to understand when it is appropriate to apply it! 🚨

Even though the class seems cohesive in our example above, I'd like to highlight some points of attention:

Complexity and Cohesion: Having many private methods might signal that your class is doing too much. Every class should have a single responsibility (Single Responsibility Principle). If you find many private methods that seem to handle different responsibilities, it might be time to consider refactoring to separate those responsibilities into distinct classes.
Testability: While public methods are easily testable through unit tests, private methods are not directly accessible for testing. If a class has many complex private methods, you might find it hard to thoroughly test the logic within these methods without exposing them.
Readability: Even if each private method does something specific and well-defined, having a large number of them can make the class code hard to read and understand, especially for new developers unfamiliar with the class.
Maintenance: More methods mean more code to maintain. If a class has many private methods, modifications and refactoring can become more challenging as changes in one method might have side effects on others.
Over-encapsulation: Although encapsulation is a fundamental pillar, you should consider whether you're encapsulating logic just for the sake of encapsulation or if there's a clear and defined reason. Sometimes, small operations can be performed directly in the method calling them, rather than being encapsulated in a separate method.
Reusability and Coupling: Private methods are not reusable outside the class. If you notice you need similar functionality in another class, you might be tempted to copy and paste the method, which isn't ideal. In such cases, consider making the method protected or public, or move the logic to a utility or base class.
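If the private rules start to pile up, one possible refactoring (a sketch, not the only option) is to turn each rule into a small public object that can be tested and reused on its own. The `IStringRule` interface and the `MinimumLengthRule`, `ContainsUppercaseRule`, and `RuleBasedValidator` classes below are hypothetical names introduced for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical abstraction: one validation rule per class.
public interface IStringRule
{
    bool IsSatisfiedBy(string value);
}

public class MinimumLengthRule : IStringRule
{
    private readonly int _minimumLength;

    public MinimumLengthRule(int minimumLength)
    {
        _minimumLength = minimumLength;
    }

    public bool IsSatisfiedBy(string value)
    {
        return value != null && value.Length >= _minimumLength;
    }
}

public class ContainsUppercaseRule : IStringRule
{
    public bool IsSatisfiedBy(string value)
    {
        return value != null && value.Any(char.IsUpper);
    }
}

// The validator keeps one public entry point and delegates to the rules.
public class RuleBasedValidator
{
    private readonly IReadOnlyList<IStringRule> _rules;

    public RuleBasedValidator(IEnumerable<IStringRule> rules)
    {
        if (rules == null)
            throw new ArgumentNullException(nameof(rules));

        // Copy the rules so the set cannot be changed from outside.
        _rules = rules.ToList();
    }

    public bool IsValid(string value)
    {
        return _rules.All(rule => rule.IsSatisfiedBy(value));
    }
}
```

The trade-off is a larger public surface: each rule class becomes something other code can depend on, so this is worth doing only when the rules genuinely need independent tests or reuse.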

I'd like to reinforce a few more things before wrapping up the post. 😄

Visibility vs Vulnerability

When we design a class, we are essentially creating a contract. The functionalities exposed - whether they are methods, properties, or events - form promises about what the class can do. However, with every promise we make, there comes a commitment.

An overly expansive public interface can become a nightmare to maintain. Every time a method is exposed, it becomes part of the class contract. This means that in the future, if we decide to change how this method works internally, we might end up breaking the code that depends on it.

On the other hand, a very restricted public interface can make the class useless or challenging to be used by other parts of the code. The class might turn into a "black box" where the desired functionality is there but is inaccessible.

Designing for Evolution

One of the main reasons to deeply care about the public interface is code evolution. At some point, almost all classes will undergo refactorings or extensions. A well-designed public interface facilitates these changes. It allows you to change the class's internal implementation without affecting the class's consumers. This quality is known as encapsulation.

Testability as a Compass

A good public interface isn't just about what other developers see and use. It's also about what tests can access. When designing a class, it's essential to think about how it will be tested. Methods that are private may be inaccessible for direct testing, making it harder to validate specific behaviors. However, exposing a method solely for testing can be bad practice.

This brings us to a delicate balance: how "open" or "closed" should our class be? A common strategy is the one we used earlier in the StringValidator example: expose a public interface that captures the main "use cases" of the class while keeping the more granular and specific logic private.
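As a sketch of testing through the public interface, each private rule of the validator can be exercised by choosing inputs that fail exactly one rule. The condensed `StringValidator` copy below is included only to keep the example self-contained, and `StringValidatorChecks` is a hypothetical helper; a real project would typically use a test framework instead.

```csharp
using System.Linq;

// Condensed copy of the StringValidator shown earlier (rules inlined for brevity).
public class StringValidator
{
    private string _value;

    public StringValidator(string value)
    {
        _value = value;
    }

    public bool IsValid()
    {
        return !string.IsNullOrEmpty(_value)
            && _value.Length >= 8
            && _value.Any(char.IsUpper)
            && _value.All(ch => char.IsLetterOrDigit(ch) || char.IsWhiteSpace(ch));
    }
}

public static class StringValidatorChecks
{
    // Each input is built to break a single rule, so a failure points at that rule.
    public static bool AcceptsValidValue() => new StringValidator("Password1").IsValid();
    public static bool RejectsNull() => !new StringValidator(null).IsValid();
    public static bool RejectsShortValue() => !new StringValidator("Pass1").IsValid();
    public static bool RejectsMissingUppercase() => !new StringValidator("password1").IsValid();
    public static bool RejectsSpecialCharacters() => !new StringValidator("Password1!").IsValid();
}
```

Nothing private had to be exposed: the checks only call `IsValid()`, yet together they cover every rule.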


Conclusion

Wow! So much to discuss about encapsulation. It may seem like a superficial subject that doesn't have much weight. But when we get to the deeper study of how to apply it on a daily basis, we see that there is a lot of room for improvement and a lot to gain from this vital pillar of OOP.

Looking beyond the surface, we realize it's not just about hiding data but about protecting a system's integrity, ensuring its effective maintenance, and fostering clear communication between software components.

When we use encapsulation correctly, we can build more resilient systems, less prone to errors, and more adaptable to changes. In this sense, encapsulation resembles a sculptor's work: not only shaping the external form but deeply concerned with the structure and internal integrity of their masterpiece.

For developers striving for excellence, it's essential to see encapsulation not just as a mere requirement but as a powerful tool that, when used correctly, can elevate software quality to new heights. So always remember to look beyond the surface, for therein lies the true art of programming.

Chronicles of a Pragmatic Programmer