Gathering Scattered Code

This page describes four related strategies. The main article describes how to compose them to Gather a Scattered Component.

The strategies on this page are:

  • Remodeling Code, Not Refactoring: The difference between them and how to use them together.
  • Overall Sequence: What execution order we will create, why, and the strategy we’ll use to get there.
  • Where to Start: Why we gather before de-isolating, even though it seems like the cross-module scattering is more valuable to fix first.
  • The Process: What the steps find, gather, merge, and de-isolate really mean.

Remodeling Code, Not Refactoring

Both remodeling and refactoring aim to preserve all user-visible behavior. They differ in how they do it. A provable refactoring guarantees that there is no change in program behavior. A remodeling assembles a new program that the developer believes will behave the same for users, so it preserves user-visible behavior only if the developer makes no error.

The component extraction is a remodeling. We intend to not change user-visible behavior, but we do change program behavior such as execution order.

However, we want to make the remodeling as safe as possible. We accomplish this by remodeling only in very constrained ways. We will change execution order between blocks. We will simplify our remodeling by refactoring our blocks until they are very simple to reason about.

Getting the Right Blocks — Commands and Queries

We can safely re-order blocks if we can limit their possible effects. We need to change execution order, which can be done safely if we know how data flows in and out of each block. One set of changes is safe for a method where data only flows out. A different set of changes is safe if data only flows in.

Many recipes will take different actions depending on whether the code is a Command, a Query, or both.

  • Query: code which gathers and analyzes current state but does not change state. Data flows out.
  • Command: code which changes state without reading the current state. Data flows in.

Commands may be parameterized and may include conditions, but the code does not gather data. If there are conditions, each branch is doing “basically the same thing,” just in different contexts. It is better if there are no conditions at all. Instead we can move the conditional logic to a Query, which will return a complete result, ready for action.
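
As a minimal sketch of this split, assume a hypothetical shopping-cart discount feature (the names Cart, DiscountDecision, decide_discount, and apply_discount are invented for illustration and do not come from the recipes). The conditional logic lives in a Query that returns a complete result, and the Command applies that result without reading state or branching:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DiscountDecision:
    """Complete result produced by the Query, ready for the Command to apply."""
    percent: int
    reason: str


@dataclass
class Cart:
    """Hypothetical piece of application state."""
    subtotal: float
    is_member: bool
    discount_percent: int = 0
    discount_reason: str = ""


def decide_discount(cart: Cart) -> DiscountDecision:
    """Query: reads current state and changes nothing. Data flows out."""
    if cart.is_member and cart.subtotal >= 100:
        return DiscountDecision(15, "member, large order")
    if cart.is_member:
        return DiscountDecision(5, "member")
    return DiscountDecision(0, "none")


def apply_discount(cart: Cart, decision: DiscountDecision) -> None:
    """Command: writes state without reading it or branching. Data flows in."""
    cart.discount_percent = decision.percent
    cart.discount_reason = decision.reason


cart = Cart(subtotal=120.0, is_member=True)
apply_discount(cart, decide_discount(cart))
```

Because decide_discount never writes and apply_discount never reads, each one can be moved independently of the other, as long as the Command still runs after the Query that feeds it.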


Overall Sequence

The component’s code starts out scattered all over the module. Creating our component requires that we collect all this code. However, there is a bigger obstacle: our component’s code is also logically scattered.

If you were to freeze execution at any moment, component code would be scattered across the call stack. The full execution tree has component blocks scattered among foreign blocks.

Merging our component requires simplifying the execution tree. Foreign code may exist both above and below our component’s code in the stack, but our component should cluster as a single lump.

Defragmenting the Call Stack

As discussed above, we can safely move code if it is either a Command or a Query. Thus when cleaning up the call stack, we care about four kinds of code:

  • Component Queries
  • Component Commands
  • Foreign Queries
  • Foreign Commands

Initially, these four kinds of code happen in a random order. We want to sort them so that we can merge all the component code together. To do that, we notice the following constraints:

  • A Command has to execute after the Query that gets the data it needs.
  • Queries may execute in any order.
  • A Query and Command may execute in either order unless the Query reads a value that the Command also writes.
  • Commands inside the component may execute in any order, but the events they fire must happen in the same order as in the original code.

Those rules will allow us to transform any execution tree into the following:

  1. All the Foreign Queries, in any order.
  2. All the Component Queries, in any order.
  3. All the Component Commands, in any order.
  4. Fire all the component events, in the original order.
  5. All the Foreign Commands, in the original order.

That executes all the component code in a single lump, which we can then extract to a single API method.
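
The sketch below illustrates that sorting on a toy example. Everything in it is an assumption made for illustration: the "loyalty" component, the dictionary-based state, and the function names are all hypothetical. It assumes that no Query reads a value that any Command writes, which is the condition the constraints above require.

```python
def original_sequence(app, comp, events):
    """Interleaved order, as the code might look before remodeling."""
    total = sum(app["items"])                            # Foreign Query
    app["total"] = total                                 # Foreign Command
    tier = "gold" if comp["points"] > 100 else "basic"   # Component Query
    comp["tier"] = tier                                  # Component Command
    events.append("tier-updated")                        # component event
    shipping = 0 if len(app["items"]) > 2 else 5         # Foreign Query
    app["shipping"] = shipping                           # Foreign Command
    earned = total // 10                                 # Component Query
    comp["pending"] = earned                             # Component Command
    events.append("points-pending")                      # component event


def update_loyalty(comp, events, total):
    """The component lump: queries, then commands, then events."""
    tier = "gold" if comp["points"] > 100 else "basic"   # 2. Component Queries
    earned = total // 10
    comp["tier"] = tier                                  # 3. Component Commands
    comp["pending"] = earned
    events.append("tier-updated")                        # 4. events, original order
    events.append("points-pending")


def sorted_sequence(app, comp, events):
    """Same work, sorted so the component code runs as a single call."""
    total = sum(app["items"])                            # 1. Foreign Queries
    shipping = 0 if len(app["items"]) > 2 else 5
    update_loyalty(comp, events, total)                  # 2-4. component lump
    app["total"] = total                                 # 5. Foreign Commands,
    app["shipping"] = shipping                           #    original order


def run(sequence):
    """Run a sequence against fresh state and return everything it touched."""
    app = {"items": [30, 50, 40]}
    comp = {"points": 150}
    events = []
    sequence(app, comp, events)
    return app, comp, events


before = run(original_sequence)
after = run(sorted_sequence)
```

In this toy case both sequences leave the application state, the component state, and the event log identical, and update_loyalty is the single method we could then expose from the component. Real code needs the same check made block by block before each move.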


Where to Start?

Should we start by gathering the scattered code within each module, or by merging the isolated code across modules?

The cross-module problems are more painful.

  • It is frustrating to look through an entire module for some chunk of code, only to find that it is actually in a totally different module.
  • It prevents simple design. The good architecture for the application introduces useless separation and complexity at the scale of this one component.
  • It is easy to get permission to fix them because both of these problems are easy to describe to non-coding stakeholders or to other teams.

Therefore, it seems like the right place to start is by merging the isolated cross-module code.

But it isn’t. We should start by gathering the scattered code within each module. Here’s why.

  • Scattering makes it hard to see the right solution. Sharing data between modules will be very difficult to solve when those concepts aren’t yet well-defined within either module.
  • Scattered code slows us down more than isolated code. This is because we need to search within a module for scattered code far more frequently than we search across modules. Even when searching across modules, we first spend a lot of time searching within one module.
  • Gathering scattered code is cheap. Low-level scattering can be solved incrementally in tiny chunks. We can spend 5 minutes, accomplish something useful, and check in.
  • Merging isolated code is expensive. To merge incrementally, we must quickly move large chunks of code that are relatively free of dependencies. We can only make progress in 5 minute commits if we have already gathered the scattered code and broken dependencies.

In short, the in-module scattering hurts us more while it exists, and we need an incremental process if this is going to work at scale. Therefore we start by addressing the in-module scattering.


The Process

The process has four steps: find, gather, merge, and de-isolate. Each step below describes its strategy, considerations, and gotchas.

Find

The hardest part is to find the code that belongs in the component. Last month’s solution used the active stories to find a lot of the component’s code, but there will still be more.

The Find step is all about locating code and entities related to the component. We find code by following data and names. We find entities by following code, specifically data flows through the call stack and shared field access.

Gather

We can refactor each code chunk into our component as we find it. We do the same as we did last month:

  1. Separate the target code from its collaborators.
  2. Find and create common data — values and entities.
  3. Extract the entities into the data component.
  4. Extract the code into the logic component.

However, there are two big changes. Previously we extracted arbitrary chunks. Now we are going to split the code into Commands and Queries. Additionally, we will take different actions when breaking out foreign code that is called from within component code. This month’s recipe will add nuance, taking slightly different actions when the foreign code is a Command vs a Query.

Merge

At this point we have all the right code in the component. However, its logic and structure are still tightly integrated with the rest of the product. Our component exposes a broad, chatty, fine-grained API consisting of every block we extracted.

We want to merge these blocks together into simpler designs. Each module should make fewer calls, and each call should do more stuff. That will make it easier to refactor our component and the rest of the application independently — and towards different designs.

Merging will change call order. It is not a refactoring. The original call sequence will generally alternate between chunks of component code and chunks of foreign code. If we want to collect the component code, we will need to execute all the component code at once, either before or after the foreign code.

Generally, changing call sequence is not safe. However, there are safe ways to do it if each of the blocks is either a Command or a Query. The recipes include a safe approach for each combination of command and query. These use a combination of techniques from Functional Programming, Tell-Don’t-Ask, and Pub/Sub.

We will create Chunky APIs by changing the execution sequence and merging chunks.
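
To make the chatty-versus-chunky contrast concrete, here is a small sketch. It is not one of the recipes: the InventoryComponent class and all of its methods are hypothetical, and it shows just one possible combination of Tell-Don't-Ask and Pub/Sub. The fine-grained blocks stay in place, and a single chunky method runs the queries first, then the command, then the event.

```python
from typing import Callable, List


class InventoryComponent:
    """Hypothetical component, shown with both its chatty and chunky APIs."""

    def __init__(self, stock: int) -> None:
        self._stock = stock
        self._reserved = 0
        self._subscribers: List[Callable[[str, int], None]] = []

    # Chatty API: one small method per extracted block.
    def available(self) -> int:
        """Query: reads state, changes nothing."""
        return self._stock - self._reserved

    def set_reserved(self, reserved: int) -> None:
        """Command: writes state without reading it."""
        self._reserved = reserved

    def subscribe(self, handler: Callable[[str, int], None]) -> None:
        """Register foreign code that wants to hear about component events."""
        self._subscribers.append(handler)

    def publish(self, event: str, quantity: int) -> None:
        """Fire a component event to every subscriber."""
        for handler in self._subscribers:
            handler(event, quantity)

    # Chunky API: the caller tells the component what it wants in one call.
    def reserve(self, quantity: int) -> None:
        # Queries first: decide everything before changing anything.
        granted = quantity <= self.available()
        new_reserved = self._reserved + quantity if granted else self._reserved
        event = "reserved" if granted else "rejected"
        # Then the Command, then the event.
        self.set_reserved(new_reserved)
        self.publish(event, quantity)


log = []
inventory = InventoryComponent(stock=5)
inventory.subscribe(lambda event, quantity: log.append((event, quantity)))
inventory.reserve(3)
inventory.reserve(4)
```

A caller of the chatty API would have had to call available, compare, call set_reserved, and then run its own follow-up code between those calls. With reserve, the caller makes one call, and any foreign follow-up runs as a subscriber after the component code has finished.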

De-Isolate

Our component now interacts with each application module through a coarse-grained API. But those chunks are all isolated from each other. It is hard to share data or create a unified design.

We want to merge those APIs to make a cohesive design. Our component’s API will probably change to a different structure than the modules in the main product. We need to ensure that this new structure remains easy to use from the main application’s existing, well-isolated module structure.

I’m not providing a recipe for this step because each codebase would require a different recipe, although the Slack channel is available for brainstorming. However, there is one common element for which I do provide a recipe: sharing lifecycle and references across isolated application modules.
