Port Between Technologies

Creating and Shipping a Chimera

Problem and Solution Overview

This recipe states only how to execute the solution. Read our Legacy Newsletter Edition: Port Between Technologies (30 July) to understand the specific problem we are solving and the solution approach.

We are going to set up a tame chimera system: something which is simultaneously both the old and the new, but in a way that won’t eat us. To accomplish that, we build the following:

Phase 1: Pick Product Strategy

A port often arises at an inflection point for a product. The system has been stagnant for a long time, and could be in one of two different states. We need to assess which state we are in and then execute the corresponding strategy.

To Revitalize

Use When Your Market-leading Product is Slowing Down

We will only ever have one product. It will start increasing in quality over time. This improvement will be slow at first and then attain a steady cadence.

Build one product that contains two technologies and shift code incrementally.

What to Expect
The product will gradually shift internally from 100% old tech to 100% new tech, 1% at a time, and it ships continuously throughout. Any parts of a feature implemented in new-tech code will be cheaper to build and change than they would have been in the old tech, so we will start seeing revitalization benefits very early in the process.

To Cannibalize

Use When Your Market is Being Replaced by New Approaches

We will build an old product and a new product, shipping both constantly. The new product will slowly erode market position from the old.

Build two products at once from a single codebase. The old one uses a mix of old and new technologies; the new one is new-tech only.

What to Expect
The new product starts out empty, with no features. We can easily take functionality from the old product, modify it, and include it in both products, but this doesn’t happen much with features core to the new product. It tends to happen with supporting functionality, such as connecting to other existing systems.

The phases are the same for each type of project, except that revitalization projects skip phase 5.

Phase 2: Call Across the Tech Boundary

Most technology pairs will need to reside in separate processes. However, you might be able to find in-process techniques. Prefer those when possible. You will need the cross-process approach for crossing language versions or for crossing program execution locations.

For example, in moving from ColdFusion to C#, we first moved from ColdFusion to ColdFusion.Net, so we could run both CF and C# code in the same .Net VM. The same approach works for many Java migrations, using JRuby, Jython, and the like.

Single Process Recipe

  1. Create a trivial program in the new language, within the same runtime process. Commit.
  2. Spike each available in-process calling technology for calls in each direction. Create one function in the new tech, call it from the old, and have it call something in the old. Options include:
    • COM
    • Underlying VM language calls (.Net, JVM, LLVM).
    • Cross-language calling libraries in new & old technologies (stdimport, dllimport, c import).
  3. Commit the winning spike.
  4. Extend your calling technology to support callbacks (function pointers, interfaces, or delegates / events). Commit.
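
To make step 2 and step 4 concrete, here is a minimal in-process sketch in Python, assuming the old tech exposes a C calling convention and you are on a Unix-like system. The new tech (Python) calls into the old (libc’s qsort) and hands it a function pointer, so the old tech calls back into the new: both directions in one spike.

    # Sketch only: assumes a C-ABI "old tech" (here libc) and a Unix-like OS.
    import ctypes
    import ctypes.util

    libc = ctypes.CDLL(ctypes.util.find_library("c"))

    # The function-pointer type the old tech expects: int (*cmp)(const void*, const void*).
    CMPFUNC = ctypes.CFUNCTYPE(ctypes.c_int,
                               ctypes.POINTER(ctypes.c_int),
                               ctypes.POINTER(ctypes.c_int))

    def py_compare(a, b):
        # New-tech code, invoked by the old tech through the function pointer.
        return a[0] - b[0]

    values = (ctypes.c_int * 5)(5, 1, 7, 33, 99)
    # New tech calls old tech; old tech calls back into new tech for each comparison.
    libc.qsort(values, len(values), ctypes.sizeof(ctypes.c_int), CMPFUNC(py_compare))
    print(list(values))  # [1, 5, 7, 33, 99]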

Cross-Process Recipe

  1. Create an infinite-loop program in the new language, and run it as a separate process. Commit it.
  2. Spike the cross-process function call options for calls in each direction. Create one function in the new tech, call it from the old, and have it call something in the old. Options include:
    • COM
    • HTTP / REST / JSON
    • Message pass
    • Socket (use only if others fail – this will require a lot more infrastructure work)
  3. Commit the winning spike.
  4. Create an init program that starts the old and the new and tells each the endpoint for the other. Commit.
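
For the HTTP / JSON option, the new-tech side of the spike can be a very small service. Here is a minimal sketch in Python; the port number and the OLD_TECH_URL environment variable (the old product’s endpoint, which the init program from step 4 would supply) are assumptions.

    # Sketch of the new-tech side of an HTTP / JSON spike. Names and port are illustrative.
    import json
    import os
    import urllib.request
    from http.server import BaseHTTPRequestHandler, HTTPServer

    OLD_TECH_URL = os.environ.get("OLD_TECH_URL")  # endpoint handed over by the init program

    class PortHandler(BaseHTTPRequestHandler):
        def do_POST(self):
            # Old tech calls into new tech: read the JSON payload it sent.
            payload = json.loads(self.rfile.read(int(self.headers["Content-Length"])))
            # New tech calls back into old tech, if we know where it lives.
            if OLD_TECH_URL:
                request = urllib.request.Request(
                    OLD_TECH_URL,
                    data=json.dumps({"from": "new-tech", "echo": payload}).encode(),
                    headers={"Content-Type": "application/json"},
                )
                urllib.request.urlopen(request)
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.end_headers()
            self.wfile.write(json.dumps({"handled_by": "new-tech"}).encode())

    if __name__ == "__main__":
        HTTPServer(("127.0.0.1", 8731), PortHandler).serve_forever()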

A Note on Performance

Don’t worry about it.

These function calls will likely be slower. But you will be making other changes that speed things up. Predictions are useless for performance problems, and therefore so are advance planning and worrying. Instead, measure your current state and measure it again periodically.

Most projects discover that 99% of the changes have no real performance impact, simply because they are not on the critical path. Use measurement to detect when you have just changed the 1% of code that does have a real impact, and then fix it. Usually the fix is to move a larger chunk of code between technologies at once. This is, obviously, something we want to do rarely, but it will happen a couple of times during the project. Just treat it as the exception, not the norm.
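
If you want a starting point for those measurements, a sketch like the following logs the duration of each cross-boundary call; call_new_tech is a hypothetical stand-in for the call you built in Phase 2.

    # Sketch: time the cross-technology call site so you notice when it lands on the critical path.
    import functools
    import logging
    import time

    def timed(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                return fn(*args, **kwargs)
            finally:
                elapsed_ms = (time.perf_counter() - start) * 1000
                logging.info("%s took %.1f ms", fn.__name__, elapsed_ms)
        return wrapper

    @timed
    def call_new_tech(payload):
        # Hypothetical stand-in for the real cross-boundary call.
        time.sleep(0.01)
        return {"echo": payload}

    logging.basicConfig(level=logging.INFO)
    call_new_tech({"ping": 1})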

Phase 3: One System With Two Technologies

Modify your existing shipping process to include the new technology. Commit multiple times for each step:

  1. Modify your existing product to optionally compile with your function-call spike.
    • Use something like #ifdefs to conditionally compile in the calls to the new-tech side (see the sketch after this list).
    • Add one UI component or action to trigger a cross-technology call and a way to verify that it happened (like posting 2 message boxes).
  2. Duplicate your local build to be able to build your product either as before or with the new tech attached.
  3. Duplicate the build step in your CI pipeline to build both versions.
  4. Update the testing step to use the version that includes the new tech.
  5. Update distribution packaging to use the version that includes the new tech.
  6. Smoke test the multi-technology version. Do the action that triggers the cross-tech calls and make sure it all works.
  7. Delete the CI build step that builds without the new technology.
  8. Delete the local build that builds without the new technology.
  9. Remove the conditional-compilation guards and clean up all vestiges of building without the new technology.
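
Python has no preprocessor, so as a hedged illustration of step 1’s “something like #ifdefs,” the sketch below gates the cross-technology call behind a build-time flag; INCLUDE_NEW_TECH and the function names are hypothetical.

    # Sketch of the step-1 guard: the chimera build sets INCLUDE_NEW_TECH=1, the plain
    # build leaves it unset, so the product runs with or without the new-tech calls.
    import os

    INCLUDE_NEW_TECH = os.environ.get("INCLUDE_NEW_TECH") == "1"

    def new_tech_hello():
        # Placeholder for the cross-technology call wired up in Phase 2.
        return "hello from the new tech"

    def on_diagnostics_clicked():
        # Existing old-tech behaviour: the first "message box".
        print("hello from the old tech")
        # The second message only appears in the multi-technology build (step 1's verification).
        if INCLUDE_NEW_TECH:
            print(new_tech_hello())

    if __name__ == "__main__":
        on_diagnostics_clicked()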

Congrats! At this point your legacy product is a minimal chimera. You can ship it as normal and include code in both technologies.

Phase 4: Compose Elements

Most technology pairs will have some sort of components. You will want to nest or host these components in each other, even across the technology barrier.

For example, when moving from a desktop app to a cloud-delivered one, one step is to migrate the UI. You need a way to, at minimum, host some HTML components inside your old tech. The prior steps allowed you to shift the functionality that drives the UI, but you also need to shift the UI itself.

  1. Identify all the components you will need to have cross technologies.
  2. For each, run spikes to find ways to host the new tech in the old. Your spike only needs to handle the simplest possible component (like a static text block). Commit the winning spike.
  3. Try spikes to host the old tech in the new. This may not be possible, but your life will be easier if you can find a solution. Time-box your spikes and commit any success.
  4. Update your real product to move one instance of the simplest component to the new technology. Time-box yourself to 1 hour, and commit.
  5. Smoke test the result.

Phase 5: Two Delivery Pipelines

If you are pursuing the cannibalize strategy, set up your second product now. Skip this phase for the revitalize strategy.

To create your second pipeline:

  1. Create a new local build that builds a product from old + new technology, but only takes the files designated for the new system. Don’t designate any files yet. (One way to implement file designations is sketched after this list.)
  2. Verify that it creates a no-op product, and commit.
  3. Set up CI around that build, including testing and packaging. Commit.
  4. Now mark just the files involved in your function-call spike as for both systems.
  5. Smoke test the result and commit.
  6. Implement a guard in your old product pipeline that allows you to exclude functionality that is supposed to be new-product only. Commit.
  7. Add a trivial pseudo-feature in the new technology and mark it as new product only. Verify that it appears in the new product and not in the old. Commit.
  8. Implement a mark that you can use to indicate that a set of functionality is old-system-only. This will not be used in your pipelines, but shows other developers that you have already considered that piece of the old tech and chosen not to bring it forward.
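
One low-tech way to implement the designations used throughout this list (sketched here, not prescribed) is a marker comment near the top of each file that the build script reads when assembling each product; the “product:” marker convention and the directory layout are assumptions.

    # Sketch: designate files via a "product:" marker in the first few lines of each file.
    # Unmarked files ship only in the old product; marker names are illustrative.
    from pathlib import Path

    MARKS = {"old-only", "new-only", "both"}

    def product_mark(path: Path) -> str:
        for line in path.read_text(encoding="utf-8", errors="ignore").splitlines()[:5]:
            if "product:" in line:
                mark = line.split("product:", 1)[1].strip()
                if mark in MARKS:
                    return mark
        return "unmarked"

    def files_for(product: str, root: Path) -> list[Path]:
        # Old pipeline keeps everything not explicitly new-only;
        # new pipeline takes only designated files.
        wanted = {"old": {"old-only", "both", "unmarked"},
                  "new": {"new-only", "both"}}[product]
        return [p for p in root.rglob("*") if p.is_file() and product_mark(p) in wanted]

    if __name__ == "__main__":
        for path in files_for("new", Path("src")):   # "src" is a placeholder root
            print(path)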

At this point you can implement 3 kinds of features:

  1. Old-product only: implement it in the old technology and do not designate it for inclusion in the new product.
  2. New-product only: implement it in the new technology and mark it as new-product only.
  3. Both products: port it from the old technology to the new and mark it as for both products. It will commonly link to the rest of the product in two different ways: one of those links will be old-product only and the other new-product only.

Phase 6: Distinguish Code Chunks as New, Old, or Hybrid

You have a lot of code and it is easy to get lost. You can tell what technology a chunk of code is using when you are updating it. However, you need to be able to know what code is in which tech without having to go read it all. You need a tool that can classify your code so that you can get higher-level views.

To create this tool:

  1. Pick your resolution: what constitutes a single “chunk of code?”
  2. Create a small script that can classify a chunk of code as “new tech,” “old tech,” “mixed,” or “unknown.” Commit it. (A combined sketch of these scripts appears after this list.)
  3. Create a script that chooses a signature for a chunk of code – some identifying info that allows you to unambiguously find the code. For example, you might use the fully-qualified name of the chunk. This signature should be stable unless the code itself is changed. Optimally, the signature is stable even when the code shifts from the old tech to the new.
  4. Create a script that can find code chunks in the easiest part of your codebase. Commit it.
  5. Wrap the two together. Commit that.
  6. Extend your script to dump results into a DB. Each record is the date, the signature for a chunk of code, and which technology it uses. Commit everything.
  7. Start running your script daily.
  8. Extend your script to cover whichever parts of your codebase you are going to work on first. Commit that.
  9. Keep extending the script to handle the remaining sections of the codebase until it covers them all. Until then, you will need to communicate that all data is partial, so it’s good to reach full coverage soon. But this doesn’t block progress on anything else.
  10. Consider whether it will be useful to extend the data with anything else. For example, you might associate a code chunk with a set of functionality in the product or a piece of the architecture. You might also associate it with a team, though that can cause problems too.
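
Pulling steps 2, 3, and 6 together, here is a minimal sketch. It uses file extension as the classifier, repository-relative path as the signature, and SQLite as the DB; all of those choices, and the extension-to-technology mapping, are assumptions you would replace with your own resolution and signature scheme.

    # Sketch: classify each chunk (here, each file), give it a stable signature, and
    # record the results in SQLite. Extension mapping and paths are illustrative.
    import sqlite3
    from datetime import date
    from pathlib import Path

    OLD_TECH = {".cfm", ".cfc"}   # hypothetical old-tech extensions
    NEW_TECH = {".cs"}            # hypothetical new-tech extension

    def classify(path: Path) -> str:
        suffix = path.suffix.lower()
        if suffix in NEW_TECH:
            return "new"
        if suffix in OLD_TECH:
            return "old"
        # At file-level resolution, "mixed" needs content inspection; everything else is unknown.
        return "unknown"

    def signature(path: Path, root: Path) -> str:
        # Repository-relative path: stable unless the code itself moves.
        return str(path.relative_to(root))

    def record(root: Path, db_path: str = "port_progress.db") -> None:
        db = sqlite3.connect(db_path)
        db.execute("CREATE TABLE IF NOT EXISTS chunks (day TEXT, signature TEXT, tech TEXT)")
        today = date.today().isoformat()
        for path in root.rglob("*"):
            if path.is_file():
                db.execute("INSERT INTO chunks VALUES (?, ?, ?)",
                           (today, signature(path, root), classify(path)))
        db.commit()
        db.close()

    if __name__ == "__main__":
        record(Path("src"))   # run this daily, e.g. from a scheduled job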

Phase 7: Visualize Current State

Create at least 2 visualizations of the current state:

  • Progress dashboard: shows the progress of the porting project over time.
  • Drill-down dashboard: allows someone to explore the current status of a particular piece of the product.

Progress Dashboard

Use a burn-up chart. Show the amount of code – number of chunks is probably a good enough proxy – in each of the 4 categories. Add an indicator for the period before all code was analyzed, stating that those results are partial.

This dashboard is intended to answer the following questions:

  • How has our progress been to date? Are others pulling their share alongside me?
  • I just did some work; how much impact did it have locally and globally?
  • Should I continue to invest in this project?
  • Optional: which pieces of the product are seeing the most rapid progress right now?

A simple stacked-bar chart for each day works well as a progress dashboard. Order the data top-to-bottom as unknown, old, mixed, then new. That way your eye will draw lines across the tops of the bars, but you still get the benefits of bars for examining each day.
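
If you want a quick local version before standing up the real dashboard, a sketch with matplotlib over the Phase 6 database might look like the following; the table and column names match the earlier sketch and are assumptions.

    # Sketch: daily stacked bars from the Phase 6 SQLite data, stacked so that
    # "unknown" sits on top and "new" at the bottom, as suggested above.
    import sqlite3
    from collections import defaultdict
    import matplotlib.pyplot as plt

    ORDER = ["unknown", "old", "mixed", "new"]   # top-to-bottom order

    db = sqlite3.connect("port_progress.db")
    counts = defaultdict(lambda: {tech: 0 for tech in ORDER})
    for day, tech, n in db.execute("SELECT day, tech, COUNT(*) FROM chunks GROUP BY day, tech"):
        counts[day][tech] = n

    days = sorted(counts)
    bottom = [0] * len(days)
    for tech in reversed(ORDER):                 # plot bottom-up: new, mixed, old, unknown
        heights = [counts[d][tech] for d in days]
        plt.bar(days, heights, bottom=bottom, label=tech)
        bottom = [b + h for b, h in zip(bottom, heights)]

    plt.legend()
    plt.ylabel("code chunks")
    plt.title("Port progress")
    plt.show()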

Optionally, you can add the ability to drill down by code chunk or by any other data you choose to add to your data source. You could also add a Top 20 list for rapidly-improving areas.

Drill-down Dashboard

Use an interactive tree map. Show each chunk of code, clustered in some useful way. Color each chunk by its status. Add a search by signature, optimally including partial signatures.

This dashboard is intended to answer the following questions:

  • When estimating a feature, how much old tech am I likely to hit, so that I can account for increased risk or cost?
  • When planning a release, how much old tech is in the impacted areas of the code? How shall I account for that risk? How shall I adjust feature priorities?
  • As I work on this story, what old tech is near me? I want to fix it – while communicating that intention in advance.
  • Which areas of the product have the most old tech remaining and thus the most risk? When should I schedule strategic initiatives to port some particular section of the code?

Optionally, add multiple different clustering algorithms. For example, if your signatures are fully-qualified function names and you also have multiple repositories for different components, you might allow clustering by name segments as well as clustering by repository + file path.

Interactivity is Key

Static reports will not work for your visualizations. They can be used to get a top-level status summary, but that is not actionable. To be actionable, the data must be interactive. People must be able to ask and answer their own questions by interacting with the data.

Use something like Tableau or Power BI to create these interactive visualizations. I prefer Tableau because it offers more interactivity to end users. Whatever you use, train everyone in the organization to interact with the resulting dashboards, so they can ask and answer their own questions.