Gather a Scattered Component

Find, Gather, Merge, and De-Isolate

Problem and Solution Overview

This recipe just states how to execute a solution. Read our Legacy Newsletter blog post: DevOps #5 – Gather a Scattered Component to understand the specific problem we are solving and the solution approach.

This recipe helps you collect the scattered code that belongs in your single-purpose component so that you can edit it independently of other teams.

  1. Find
  2. Gather
  3. Merge – Pick Your Approach
  4. Merge – Eliminate Sequence Concerns
    1. Prepare by Inlining Methods
    2. Clean the Commands
    3. Clean the Queries
    4. Collect Component Code to the Middle
    5. How to Swap a Pair of Blocks
    6. Moving a Foreign Command to be After a Component Command
    7. Moving a Component Query to be After a Foreign Query
    8. Moving a Command to be After a Query
    9. Extract Collected Code
  5. De-Isolate
    1. Use Backward References
    2. Solve Lifetime Control
    3. Specific Concerns

Finding Code That Belongs in Your Component

Previously we covered the most efficient way to find code related to your component — use an active story. That lets you incrementally discover the components you need and the most essential code and data for those components. However, this approach can only detect code that is actively changing.

Now we will examine strategies that can find code that is not actively changing. These are less efficient but still incremental. Here are some examples:

  • Follow data flow to find related code.
  • Follow naming patterns and similarities to find related code.
  • Follow usages from component code to find related data.
  • Follow parameter and return types to find related data.

If you need help finding related code, please ask on the Code by Refactoring Slack Community. We can discuss these techniques or identify more. Please also join if you want to share techniques you have found to work well.

Gathering Component Code and Breaking Its Dependencies

This step is well-covered in the recipes for Gathering Component Code. However, later steps of this recipe will require more precise specification of data flow and program execution timing across the component boundary. That requires enhancing one of the prior recipes.

If code in the component calls code outside the component, we previously guided you to extract the foreign code and convert the call to use an event. This month, we ask for different actions depending on whether the foreign code is a Command, a Query, or both.

When code that belongs in the component is entwined with code that belongs outside the component, we now:

  1. Extract method on the code chunks. Ensure each method either belongs entirely inside or entirely outside the component.
  2. Examine each method call from a component method to a non-component method. Determine whether it is a Command, Query, or Both.
    • If Command: convert the method call to an event.
    • If Query: convert the method call to a function-valued parameter.
    • If Both: convert the method call to an event, but parameterize it with all needed data from inside the component, so that the handlers never have to call back into the component for anything.
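
The two conversions above can be sketched as follows. This is a minimal illustration, not the recipe's prescribed implementation: the `Event` class, `price_order`, `order_priced`, and `lookup_tax_rate` are hypothetical names invented for the example.

```python
from typing import Callable, List


class Event:
    """Minimal hypothetical event: handlers are fire-and-forget Commands."""

    def __init__(self) -> None:
        self._handlers: List[Callable[..., None]] = []

    def subscribe(self, handler: Callable[..., None]) -> None:
        self._handlers.append(handler)

    def fire(self, *args) -> None:
        for handler in self._handlers:
            handler(*args)


# Replaces a direct call from the component to a foreign Command.
order_priced = Event()


def price_order(quantity: int, unit_price: float,
                lookup_tax_rate: Callable[[], float]) -> None:
    """Component method (hypothetical)."""
    # Foreign Query: arrives as a function-valued parameter.
    total = quantity * unit_price * (1 + lookup_tax_rate())
    # Foreign Command: becomes an event that carries all the data it needs.
    order_priced.fire(quantity, total)


# Foreign code wires itself in from outside the component.
audit_log = []
order_priced.subscribe(lambda quantity, total: audit_log.append((quantity, total)))
price_order(2, 10.0, lookup_tax_rate=lambda: 0.5)
```

The component no longer names any foreign method. The caller supplies the Query, and foreign code subscribes its own Command.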

Merging Fragments into an API

Pick your approach based on the call sequence. Look at the call stacks or sequences in the application that call into your component. Use the template method approach if the following are true:

  • The call stack executes a lot of the component’s code in sequence.
  • It may occasionally execute non-component Commands (via an event), but it does not call non-component Queries at any point during the sequence.

Otherwise use the sequence-changing approach.

Approach 1: Create template methods

Simply extract a template method for the calls to the component code. You may need to inline some existing methods to make this possible. Ensure all outbound calls from the component are to Commands, via events.

If you need specific help, reach out on the Code by Refactoring Slack channel.

Approach 2: Eliminate sequencing concerns

This approach follows this overall sequence:

  1. Prepare by inlining methods.
  2. Clean Commands and Clean Queries. Clean both component and foreign blocks.
  3. Collect component code to the middle by iteratively Swapping block order.
  4. Extract collected code.

Prepare by Inlining Methods

The section of call stack you intend to extract has several calls to component methods (blocks) and may also contain calls to extracted non-component blocks. It may also contain 2 kinds of obstacles:

  1. Calls or returns up and down the call stack that happen between blocks you wish to merge.
  2. Sequences of non-component code between calls to blocks.

Eliminate them as follows:

  1. Extract Method every piece of code inside the methods in the call stack you wish to merge.
    • Try to extract control flow to be contained inside blocks. Leave each method as a straight-line sequence of method calls.
  2. Inline Method any methods that contain blocks and are called from other methods within your target merge scope.
    • Don’t worry about good design. Just inline the relevant methods.
    • It is OK to create duplicate code if necessary. This can happen if there is an inner method that is used in multiple outer methods and you wish to extract each of them cleanly. Inline the inner method multiple times.
    • For example:
def outer_1():
  block_1()
  inner_1()
  component_1()
  block_2()
  inner_2()
  component_2()

def outer_2():
  block_1()
  block_5()
  inner_2()
  component_6()

def inner_1():
  component_3()
  block_3()
  component_4()

def inner_2():
  block_4()
  component_5()

# Becomes

def outer_1():
  block_1()
  component_3()
  block_3()
  component_4()
  component_1()
  block_2()
  block_4()
  component_5()
  component_2()

def outer_2():
  block_1()
  block_5()
  block_4()
  component_5()
  component_6()

Next, classify each block in the section as a Command or a Query:

  1. Pick one called method in the section you are going to merge.
    • This could be a component method or a foreign one.
  2. Identify whether it is a Command or a Query. Rename method to include that fact in the name.
  3. If it is both then split it into chunks. Put as much code as possible into each Query. You should end up with large Query blocks that are read-only and small Command blocks that are as logic-free as possible.
    • If your Command reads values from any mutable location, now is a good time to introduce parameters to split the read out of the Command. That way you can extract it into the Query. It’s not a big deal if you miss some here – you will clean them up in the Clean Commands step. Doing easy and obvious ones now can just save some time.
  4. Repeat for the next method, until all of them are clearly identified as Commands or Queries.
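
As a sketch of step 3, here is one way a method that is both a Command and a Query might be split. The `Inventory` type and all function names are hypothetical, chosen only to show the naming convention and the "large Query, small Command" shape.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass
class Inventory:
    stock: Dict[str, int] = field(default_factory=dict)


# Before: reads, decides, and writes in one method (both Command and Query).
def reserve_both(inventory: Inventory, sku: str, wanted: int) -> int:
    available = inventory.stock.get(sku, 0)
    granted = min(available, wanted)
    inventory.stock[sku] = available - granted
    return granted


# After: a read-only Query holds as much of the logic as possible...
def query_reservation(inventory: Inventory, sku: str,
                      wanted: int) -> Tuple[int, int]:
    available = inventory.stock.get(sku, 0)
    granted = min(available, wanted)
    return granted, available - granted


# ...and a small, logic-free Command performs the write.
def command_set_stock(inventory: Inventory, sku: str, remaining: int) -> None:
    inventory.stock[sku] = remaining
```

The Command still writes to a shared object, but it no longer reads from one. Everything it needs arrives as a parameter.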

Clean the Commands

Clean all the commands that may need to move — both component and foreign.

First ensure all external code is separated.

This should have already been completed as part of the Gather step. Now we want to make sure we haven’t missed anything. Look for any code that does not belong in this component. If you see any, handle it as in the Gather recipe.

Now clean up any unsafe data reads:

  1. Identify each potentially-unsafe data read in the Command.
    • Reading data out of a parameter is safe, as long as it is not a reference to a field that could be written to by any other command. Immutable parameters are a good way to meet this restriction.
  2. Pick one read from an unsafe variable. We’ll refer to that variable as source_var.
  3. Observe whether source_var is set earlier in this method.
    • If so:
      A. Introduce Variable at the point of reading. We’ll refer to this new variable as temp_var.
      B. Set the value for temp_var at each write point for source_var.
      C. If temp_var is written from reading a different unsafe variable, make sure that other unsafe var is on the list you made at step 1.
    • Otherwise:
      A. Introduce Parameter to move the read up to the callers.
      B. Check each caller. See if that is a Command or a template method that calls Commands and Queries.
      C. If Command: add it to the list of Commands to clean.
      D. If template method: extract the read of source_var into a new Query or add it to an existing Query.
  4. Commit. Go back to step 2 if there are more unsafe reads on your list.
  5. Merge.
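
The "Otherwise" branch of step 3 might look like this in practice. It is a sketch under assumptions: `shared` stands in for any widely shared mutable state, and the function names are hypothetical.

```python
# Hypothetical widely shared, mutable state.
shared = {"rate": 0.25, "last_fee": 0.0}


# Before: the Command reads shared mutable state (source_var is shared["rate"]).
def command_record_fee_before(amount: float) -> None:
    shared["last_fee"] = amount * shared["rate"]


# After Introduce Parameter: the read has moved up to the caller.
def command_record_fee(amount: float, rate: float) -> None:
    shared["last_fee"] = amount * rate


# The caller is a template method, so the read is extracted into a Query.
def query_rate() -> float:
    return shared["rate"]


def charge(amount: float) -> None:
    """Template method: a straight-line sequence of Queries and Commands."""
    rate = query_rate()
    command_record_fee(amount, rate)


charge(100.0)
```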

Clean the Queries

Ensure the query doesn’t contain any events, calls to commands, or writes to widely-shared variables. If you find any, split the code into 2 queries and a command:

  1. Extract Method the (Query) code before the side-effect. Extract Method the side-effect to make a Command. Extract Method all the code after the side-effect.
  2. If the Command is foreign code, convert it to an event.
  3. Commit.
  4. Clean the Command.
  5. Inline Method the entire method (which now consists of 2 Queries and a Command/event).
  6. Commit & merge.
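
A small sketch of the split into 2 Queries and a Command. The names `access_log`, `find_user_before`, and `handle_request` are hypothetical. The side effect here is a write to a shared list.

```python
from typing import Dict, List

# Hypothetical widely shared variable.
access_log: List[str] = []


# Before: a "Query" that hides a side effect in the middle.
def find_user_before(users: Dict[str, str], name: str) -> str:
    key = name.strip().lower()
    access_log.append(key)  # side effect
    return users.get(key, "unknown")


# After step 1: the code before, the side effect, and the code after.
def query_normalize(name: str) -> str:
    return name.strip().lower()


def command_log_access(key: str) -> None:
    access_log.append(key)


def query_lookup(users: Dict[str, str], key: str) -> str:
    return users.get(key, "unknown")


# After step 5 (Inline Method), the caller holds the three calls directly.
def handle_request(users: Dict[str, str], name: str) -> str:
    key = query_normalize(name)
    command_log_access(key)
    return query_lookup(users, key)
```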

Convert the Query to pure functional code. Create a single type to hold its result.

  1. Check if the Query returns any data by setting values for someone to check later. Skip this part if there are no “returns by side effect”.
    • For example, it might set values on fields of shared objects, in out params, or in properties on params.
  2. If there is a return value, encapsulate it in a new class. The new class has a single property, which contains the old result.
    • Otherwise you’ll do the recipe as follows, but the first iteration you’ll create the new class instead of just adding one property to it.
    • Just name the return class Applesauce for now. We will find a good name later, once we have more information about the data it contains.
  3. Identify one return by side effect.
  4. At the point of write, also write to a non-existent property on the return type. Give the property a name based on the original place you were writing to.
  5. Auto-generate a property on the return type to make the write compile.
  6. Commit.
  7. Find usages of the original write location. Pick one.
  8. Update it to read the property on the return value instead of the other location.
    • This requires data analysis and is risky. Mark it accordingly in the commit log and take care.
  9. Test & Commit.
  10. Repeat from step 7 until none remain for this call stack.
    • Remember to check commands & queries that will be invoked later in the same top-level template method.
    • This is also dangerous. It is easy to miss reads. Be thorough in your analysis.
  11. Merge.
  12. Remove the write to the return-by-side-effect location. You will now only write to your new property on the return value.
  13. Test, commit, and merge. This step will expose errors in previous steps. Take care and be ready to revert, fix the error, and try again.
  14. Repeat from step 3 until no returns-by-side-effect remain.
  15. Examine the properties in the Applesauce class. Identify what this represents in your domain and rename it.
  16. Commit & merge.
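
The end state of this conversion might look like the sketch below. `Report`, `summarize_before`, and `query_summarize` are hypothetical. `Applesauce` is the recipe's placeholder name, kept here as it would be before the final rename.

```python
from dataclasses import dataclass
from typing import List


class Report:
    """Hypothetical shared object that used to receive a return by side effect."""

    def __init__(self) -> None:
        self.warning_count = 0


# Before: one real return value plus one return by side effect.
def summarize_before(values: List[int], report: Report) -> int:
    report.warning_count = sum(1 for v in values if v < 0)
    return sum(v for v in values if v >= 0)


@dataclass(frozen=True)
class Applesauce:
    total: int          # the old return value
    warning_count: int  # replaces the write to report.warning_count


# After: a pure function whose single result type holds everything it produces.
def query_summarize(values: List[int]) -> Applesauce:
    warnings = sum(1 for v in values if v < 0)
    total = sum(v for v in values if v >= 0)
    return Applesauce(total=total, warning_count=warnings)
```

Callers that used to read `report.warning_count` now read the property on the returned value.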

Collect Component Code to the Middle

This step is not a refactoring. We are remodeling. We intend for the code to do the same thing, but we can’t guarantee it.

The goal is to end up with this final call order:

  1. Foreign Queries.
  2. Component Queries.
  3. Component Commands.
    • Includes events that fire some foreign Commands.
  4. Foreign Commands (direct calls).

Get there by swapping one pair of blocks at a time. Commit after each swap.

How to Swap a Pair of Blocks

The approach depends on each block’s kind — Query or Command. There are 4 combinations:

  • Command then Command.
  • Command then Query.
  • Query then Command.
  • Query then Query.

Not all of these help us reach the order above. Removing the useless options, we only need to be able to change:

  • Foreign Command then Component Command.
  • Command then Query.
  • Component Query then Foreign Query.

Moving a Foreign Command to be After a Component Command

Neither reads any data, so they can’t have a data conflict. Specifically, the foreign command never calls back into the component for any data — the component gives it everything upon invocation.

It doesn’t matter what order data writes happen in unless some other code is reading. We need to prevent such reads.

  1. If the application is strongly multi-threaded, then acquire a lock for the entire execution of all the commands.
  2. Don’t allow foreign code to register any Query on an event handler (or call it from such an event) during the time when we are swapping our call order around. We can resume allowing arbitrary code on events after we have finished our component extraction.

With those restrictions, you can simply swap the order of the Commands.

Moving a Component Query to be After a Foreign Query

Just swap them. Neither modifies data so order is irrelevant.

Moving a Command to be After a Query

  1. Make a list of every non-local field or other data storage written to by the Command.
  2. Pick one and find usages. Identify any read usages in the Query (including code it calls).
  3. Pick one read from the field.
  4. Introduce Parameter, so that the code now reads from the parameter and the parameter is set from the field by the caller.
  5. Identify the data source that the Command uses to write to the field.
    • This will be a parameter to the Command. If not, then clean the Command until it is.
  6. Change the parameter assignment in the Query call to be set directly from the data source you identified in step 5.
  7. Commit.
  8. Update all other read usages from the same field in this Query to also use the parameter.
  9. Commit.
  10. Repeat from step 2 until you finish the entire list you created at step 1.
  11. Swap the order of the Command and Query.
  12. Commit.
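
The steps above can be sketched on a single field. In this illustration `state["limit"]` is the hypothetical non-local field, and the Command's data source is its `new_limit` parameter.

```python
# Hypothetical non-local field written by the Command.
state = {"limit": 0}


def command_set_limit(new_limit: int) -> None:
    state["limit"] = new_limit  # the data source is a parameter (step 5)


# Before: the Query reads the field the Command writes, so order matters.
def query_is_allowed_before(amount: int) -> bool:
    return amount <= state["limit"]


# After steps 4 to 8: the Query reads a parameter instead of the field.
def query_is_allowed(amount: int, limit: int) -> bool:
    return amount <= limit


def before_swap(amount: int, new_limit: int) -> bool:
    command_set_limit(new_limit)
    return query_is_allowed_before(amount)


def after_swap(amount: int, new_limit: int) -> bool:
    # Step 6: the parameter is set from the Command's data source.
    allowed = query_is_allowed(amount, limit=new_limit)
    # Step 11: the Command now runs after the Query.
    command_set_limit(new_limit)
    return allowed
```

The Query sees the value the Command is about to write, so it returns the same answer even though it now runs first.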

Extract Collected Code

At this point the collected code will be a simple template method. Extract method to pull it out.
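
For illustration, the result might resemble the sketch below, with the four groups in their final order and the middle two extracted. All names are hypothetical.

```python
from typing import List

stored: List[float] = []    # hypothetical component state
notified: List[float] = []  # hypothetical foreign state


def foreign_query_rate() -> float:
    return 0.5


def component_query_total(amount: float, rate: float) -> float:
    return amount * (1 + rate)


def component_command_store(total: float) -> None:
    stored.append(total)


def foreign_command_notify(amount: float) -> None:
    notified.append(amount)


def component_api(amount: float, rate: float) -> None:
    """Extracted template method: the collected component code."""
    total = component_query_total(amount, rate)  # 2. component Queries
    component_command_store(total)               # 3. component Commands


def handle_purchase(amount: float) -> None:
    rate = foreign_query_rate()                  # 1. foreign Queries
    component_api(amount, rate)                  # the collected middle
    foreign_command_notify(amount)               # 4. foreign Commands


handle_purchase(10.0)
```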

De-Isolate Component Code

No matter what design you use for your component, you will need to solve the reference and lifetime problem. Most components will have several instances live at a time, and the foreign code needs some way to refer to the correct instance when they make an API call. Additionally, you need some way to know when to clean up an instance.

Your codebase is going to have too many components for forward references to be feasible. Forward references would mean that the application keeps one object reference per component. With hundreds of components, that means the application needs to keep a massive directory of these objects and refer to that directory object everywhere. It becomes a bottleneck.

Even worse, components will depend on each other, so the application code would have to pass these component objects in to other components. The dependency snarl will continue to grow until it either clogs everything or pulls in a complicated Dependency Inversion container and library.

There is a simpler approach: use backward references.

Use Backward References

Each component deals frequently with one particular type from the application (or another component). With backward references, the component associates an instance of itself with each instance of that other type. The component keeps a dictionary mapping the foreign instance to the component instance and looks it up as needed.

Now calling code can use the component naturally. It just passes in the objects that the component needs for data, and everything just works. That’s true whether the component is called from the main application or from another component.

This also makes it trivial to integrate your component. The component uses the same dictionary, no matter which module it is called from. The various modules pass around the same data objects, so the component can easily find the correct instance for any API call. You are free to use a different module structure in your component’s design.
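
A minimal sketch of a backward reference, assuming a hypothetical foreign type `Customer` as the key and a hypothetical `Loyalty` component. Lifetime cleanup is deliberately left out of this sketch.

```python
from typing import Dict


class Customer:
    """Hypothetical foreign type that the component uses as its key."""

    def __init__(self, name: str) -> None:
        self.name = name


class Loyalty:
    """Hypothetical component with one instance per Customer."""

    # The backward reference: foreign instance -> component instance.
    _by_customer: Dict[Customer, "Loyalty"] = {}

    def __init__(self) -> None:
        self.points = 0

    @classmethod
    def _for(cls, customer: Customer) -> "Loyalty":
        if customer not in cls._by_customer:
            cls._by_customer[customer] = cls()
        return cls._by_customer[customer]

    # Public API: callers pass only the foreign object they already hold.
    @classmethod
    def add_points(cls, customer: Customer, points: int) -> None:
        cls._for(customer).points += points

    @classmethod
    def points_for(cls, customer: Customer) -> int:
        return cls._for(customer).points


ann, bob = Customer("Ann"), Customer("Bob")
Loyalty.add_points(ann, 5)
Loyalty.add_points(ann, 2)
```

Calling code never holds a `Loyalty` object. It passes the `Customer` it already has, and the component finds the right instance.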

Reach out on the Code by Refactoring Slack Community if you want help setting up or using backward references.

However, backward references create a new challenge: lifetime control.

Solve Lifetime Control

You need to know when a non-component object goes out of scope so that you can clean up the corresponding component instances. There are several general techniques, depending on your language capabilities.

  1. Use deterministic finalization with an event.
  2. Use a field that contains a bag of disposable objects, which call the finalization methods.
  3. Hook an existing lifetime event.

Each of these allows you to effectively register a cleanup method invocation. That cleanup method removes the corresponding component instance from the dictionary. Each of these designs also shares the following advantages:

  • The foreign code doesn’t need to specifically call the handlers. The regular instance deletion / garbage collection process takes care of it.
  • There is no custom code; you can write any of these as a pattern and then mix it in to any type which turns out to be a component key.
  • The foreign object does not need to change when we change the number of components that use it as a key.
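
In Python, one way to register such a cleanup is the standard library's `weakref.finalize`, which runs a callback when an object is garbage collected. The sketch below is an assumption about how that could be applied here, not the recipe's own implementation. `Document` and `SpellCheck` are hypothetical, and the dictionary is keyed by `id()` so that it does not keep the foreign object alive.

```python
import gc
import weakref
from typing import Dict, Set


class Document:
    """Hypothetical foreign key type. Plain classes support weak references."""


class SpellCheck:
    """Hypothetical component with one instance per Document."""

    _by_document: Dict[int, "SpellCheck"] = {}

    def __init__(self) -> None:
        self.ignored_words: Set[str] = set()

    @classmethod
    def for_document(cls, document: Document) -> "SpellCheck":
        key = id(document)
        if key not in cls._by_document:
            cls._by_document[key] = cls()
            # Register the cleanup. The callback holds only the id,
            # never the document itself.
            weakref.finalize(document, cls._by_document.pop, key, None)
        return cls._by_document[key]


doc = Document()
SpellCheck.for_document(doc).ignored_words.add("refactor")
count_while_alive = len(SpellCheck._by_document)
del doc       # in CPython the finalizer runs when the last reference goes away
gc.collect()  # other implementations may need a collection pass
remaining = len(SpellCheck._by_document)
```

The foreign `Document` class needs no changes and never calls the cleanup itself.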

Discuss implementation details for any of these lifetime strategies on the Code by Refactoring Slack Community.

Specific Concerns

Each component will have a different optimal design, so each will follow a different recipe to get there. If collaboration would help you find the path to your design, discuss strategies on the Code by Refactoring Slack Community. A wide variety of tools are also available as part of the Code by Refactoring Advanced Workshops.