1 — Introduce and Test New Functionality
Identify the problem before assuming solutions.
Any data transformation application consists of two parts: the core and the boundary.
The boundary is much more difficult to test, because it involves interacting with other systems.
From another perspective, the job of a data transformation application also splits into two parts. Parsing, understanding, and cleaning data is the part that is much more difficult to test, simply because of the wide variety of possible input variations and messy data.
The key to keeping a data transformation application simple is to keep the two hard parts separate from each other. In other words, the boundary code should only load and store opaque data, while all parsing, understanding, and cleaning of data happens only in the core.
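To make the separation concrete, here is a minimal sketch in Python. It is an illustration under assumptions, not the course's own code: the names `load_raw`, `store_raw`, `transform`, and `run`, the JSON input format, and the `name`/`age` fields are all hypothetical. The boundary functions move bytes without looking inside them, and the core is a pure function that can be tested without touching any other system.

```python
import json
from pathlib import Path


# Boundary (hypothetical names): interacts with the file system,
# but treats the payload as opaque bytes and never inspects it.
def load_raw(path: Path) -> bytes:
    return path.read_bytes()


def store_raw(path: Path, payload: bytes) -> None:
    path.write_bytes(payload)


# Core (hypothetical): all parsing, understanding, and cleaning lives here.
# It is a pure function from bytes to bytes, so it needs no external system.
def transform(raw: bytes) -> bytes:
    records = json.loads(raw.decode("utf-8"))
    cleaned = []
    for record in records:
        name = str(record.get("name", "")).strip()
        if not name:
            continue  # drop records with no usable name
        cleaned.append({"name": name.title(), "age": int(record["age"])})
    return json.dumps(cleaned).encode("utf-8")


# Wiring: the only place where boundary and core meet.
def run(src: Path, dst: Path) -> None:
    store_raw(dst, transform(load_raw(src)))
```

With this shape, a test of the messy-input handling only needs to call `transform` with a bytes literal; no files, network, or database are involved.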
This module focuses on the core; the next module will focus on the boundary.
Implementing all the parsing, understanding, and data cleaning logic in the core requires two techniques: stick-figure testing and data pipeline design.
Identify the problem before assuming solutions.
Create a design now that the code is written.
Use stick-figure testing to modify existing code.
Use a data pipeline to make transforms easy.