软件设计中最基本的问题之一是:给定两个功能,它们应该在同一位置一起实现,还是应该分开实现?这个问题适用于系统中的所有级别,例如功能,方法,类和服务。例如,应该在提供面向流的文件 I/O 的类中包括缓冲,还是应该在单独的类中?HTTP 请求的解析应该完全在一种方法中实现,还是应该在多个方法(甚至多个类)之间划分?本章讨论做出这些决定时要考虑的因素。这些因素中的一些已经在前面的章节中进行了讨论,但是为了完整起见,这里将对其进行重新讨论。
在决定是合并还是分开时,目标是降低整个系统的复杂性并改善其模块化。看来实现此目标的最佳方法是将系统划分为大量的小组件:组件越小,每个单独的组件可能越简单。但是,细分的行为会带来额外的复杂性,而这在细分之前是不存在的:
- 一些组件的复杂性仅来自组件的数量:组件越多,就越难以追踪所有组件,也就越难在大型集合中找到所需的组件。细分通常会导致更多接口,并且每个新接口都会增加复杂性。
- 细分可能会导致附加代码来管理组件。例如,在细分之前使用单个对象的一段代码现在可能必须管理多个对象。
- 细分产生分离:细分后的组件将比细分前的组件相距更远。例如,在细分之前位于单个类中的方法可能在细分之后位于不同的类中,并且可能在不同的文件中。分离使开发人员更难于同时查看这些组件,甚至很难知道它们的存在。如果组件真正独立,那么分离是好的:它使开发人员可以一次专注于单个组件,而不会被其他组件分散注意力。另一方面,如果组件之间存在依赖性,则分离是不好的:开发人员最终将在组件之间来回翻转。更糟糕的是,他们可能不了解依赖关系,这可能导致错误。
- 细分可能导致重复:细分之前的单个实例中存在的代码可能需要存在于每个细分的组件中。
如果它们紧密相关,则将代码段组合在一起是最有益的。如果各部分无关,则最好分开。以下是两个代码相关的一些提示:
- 他们共享信息;例如,这两段代码都可能取决于特定类型文档的语法。
- 它们一起使用:任何使用其中一段代码的人都可能同时使用另一段代码。这种关系形式只有在双向关系中才具有吸引力。作为反例,磁盘块高速缓存几乎总是包含哈希表,但是哈希表可以在许多不涉及块高速缓存的情况下使用。因此,这些模块应该分开。
- 它们在概念上重叠,因为存在一个简单的更高级别的类别,其中包括这两段代码。例如,搜索子字符串和大小写转换都属于字符串操作类别。流控制和可靠的交付都属于网络通信的范畴。
- 不看其中的一段代码就很难理解。
本章的其余部分使用更具体的规则以及示例来说明何时将代码段组合在一起以及何时将它们分开是有意义的。
One of the most fundamental questions in software design is this: given two pieces of functionality, should they be implemented together in the same place, or should their implementations be separated? This question applies at all levels in a system, such as functions, methods, classes, and services. For example, should buffering be included in the class that provides stream-oriented file I/O, or should it be in a separate class? Should the parsing of an HTTP request be implemented entirely in one method, or should it be divided among multiple methods (or even multiple classes)? This chapter discusses the factors to consider when making these decisions. Some of these factors have already been discussed in previous chapters, but they will be revisited here for completeness.
When deciding whether to combine or separate, the goal is to reduce the complexity of the system as a whole and improve its modularity. It might appear that the best way to achieve this goal is to divide the system into a large number of small components: the smaller the components, the simpler each individual component is likely to be. However, the act of subdividing creates additional complexity that was not present before subdivision:
- Some complexity comes just from the number of components: the more components, the harder to keep track of them all and the harder to find a desired component within the large collection. Subdivision usually results in more interfaces, and every new interface adds complexity.
- Subdivision can result in additional code to manage the components. For example, a piece of code that used a single object before subdivision might now have to manage multiple objects.
- Subdivision creates separation: the subdivided components will be farther apart than they were before subdivision. For example, methods that were together in a single class before subdivision may be in different classes after subdivision, and possibly in different files. Separation makes it harder for developers to see the components at the same time, or even to be aware of their existence. If the components are truly independent, then separation is good: it allows the developer to focus on a single component at a time, without being distracted by the other components. On the other hand, if there are dependencies between the components, then separation is bad: developers will end up flipping back and forth between the components. Even worse, they may not be aware of the dependencies, which can lead to bugs.
- Subdivision can result in duplication: code that was present in a single instance before subdivision may need to be present in each of the subdivided components.
Bringing pieces of code together is most beneficial if they are closely related. If the pieces are unrelated, they are probably better off apart. Here are a few indications that two pieces of code are related:
- They share information; for example, both pieces of code might depend on the syntax of a particular type of document.
- They are used together: anyone using one of the pieces of code is likely to use the other as well. This form of relationship is only compelling if it is bidirectional. As a counter-example, a disk block cache will almost always involve a hash table, but hash tables can be used in many situations that don’t involve block caches; thus, these modules should be separate.
- They overlap conceptually, in that there is a simple higher-level category that includes both of the pieces of code. For example, searching for a substring and case conversion both fall under the category of string manipulation; flow control and reliable delivery both fall under the category of network communication.
- It is hard to understand one of the pieces of code without looking at the other.
The rest of this chapter uses more specific rules as well as examples to show when it makes sense to bring pieces of code together and when it makes sense to separate them.