Extending 3rd-party Open Source Projects
How software developers can extend 3rd-party open source software while sidestepping maintainability and quality-assurance pitfalls.
Have you ever found an open-source tool or project that almost solved a problem for you, but fell short of being exactly what you needed? Since you have the source code and can modify it, you can extend the system yourself to add the missing functionality. While not always the best choice, at times this approach can save a lot of time and effort compared to coding a custom solution from scratch.
Doing this is harder than it might seem at first. There are two classes of potential pitfalls. The first is maintainability: when new releases come from upstream, how can you upgrade while still preserving your modifications and enhancements? The second is quality. When modifying a code base you are intimately familiar with, it's relatively easy to add features without introducing bugs. But how can you be confident of the program's correctness when you are changing code you don't know much about?
How you manage the code can make the process far, far easier. Here are some recommendations:
- Keep a version control repository of the code. Make the first commit the original, unmodified source that you have downloaded.
- If the software contains its own test suite, find out how to run all the tests, and create a small script or tool that will easily run all of them. Check this in; it will be your second commit.
- Aim to make small- to medium-sized commits that each contain just one feature. Splitting a feature over several commits is fine; avoid packing more than one feature into one commit.
- Decide whether you are going to contribute your changes to the original project, or port them yourself if and when you upgrade to a future version of the software.
- As you continue to make modifications, make sure the tests pass; automate test running using a continuous integration system, if possible. Of course, it's a good idea to write unit tests specific to the extensions you add, and include those in the full test suite.
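The test-running script from the second recommendation can be very small. Here is a minimal sketch in Python; the command list is an assumption (it pretends the project's tests live in a `tests` directory and run under `unittest`), so substitute whatever commands the upstream project and your own extensions actually use.

```python
import subprocess
import sys

# Hypothetical test commands: one entry per suite you want to run.
# Replace these with the real commands for the project you are extending.
TEST_COMMANDS = [
    [sys.executable, "-m", "unittest", "discover", "-s", "tests"],
]


def run_all(commands):
    """Run each command in order and return the list of commands that failed."""
    failed = []
    for cmd in commands:
        print("running:", " ".join(cmd))
        result = subprocess.run(cmd)
        if result.returncode != 0:
            failed.append(cmd)
    return failed


def main(commands=None):
    """Run every suite; return 0 if all passed, 1 otherwise."""
    commands = TEST_COMMANDS if commands is None else commands
    failed = run_all(commands)
    for cmd in failed:
        print("FAILED:", " ".join(cmd))
    return 1 if failed else 0
```

Ending the script with `sys.exit(main())` gives it a non-zero exit status on any failure, which is what most continuous integration systems look for. Every suite runs even after an earlier one fails, so a single run reports everything that is broken.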
By doing the above, you will make several things easier in the future. First, by running the tests, you obviously have some assurance that your modifications are not inadvertently breaking anything.
Second, the code implementing your new features will be naturally organized by commit in version control, with the commit messages providing documentation. This helps if sometime later you want to upgrade to a new version of the project. Doing so requires that your changes be re-applied to the new version's code. This can be anywhere from easy to impractical, depending on exactly how the upstream code has changed in the new release. Having the code for your new features organized in this way will ensure it's no harder than it has to be.
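If the repository is kept in git, one way to re-apply your changes is `git rebase --onto`, which replays a range of commits on top of a different base. The sketch below assumes that both the old and the new upstream releases exist as commits in your repository (for example, on a branch where each release is committed as an unmodified snapshot) and that your features live on their own branch. All ref names are hypothetical, and other version control systems have their own equivalents.

```python
import subprocess


def rebase_command(old_release, new_release, feature_branch):
    """Build the git command that replays local commits onto a new release.

    All three arguments are hypothetical ref names: old_release is the commit
    holding the unmodified source you started from, new_release is a commit
    holding the new upstream source, and feature_branch holds your changes.
    """
    # git rebase --onto <newbase> <upstream> <branch> replays the commits in
    # <upstream>..<branch> on top of <newbase>, one commit at a time.
    return ["git", "rebase", "--onto", new_release, old_release, feature_branch]


def port_features(old_release, new_release, feature_branch, repo_dir="."):
    """Run the rebase in repo_dir; return True if git exited with status 0."""
    cmd = rebase_command(old_release, new_release, feature_branch)
    return subprocess.run(cmd, cwd=repo_dir).returncode == 0
```

Because the rebase proceeds commit by commit, any conflict stops at the one feature it affects, which is where small, single-feature commits pay off. After the rebase completes, run the full test suite again before trusting the result.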
Depending on the nature of your additions, and whether the original project maintainers are receptive, you may want to lobby to get your code integrated into the product. There are great benefits to you for this; a big one is that, in the best case, the people who know the project best accept responsibility for maintaining your features as the product changes over time. Again, the organization of the code implementing your features helps you prepare good patches that are easier for upstream to integrate, and thus more likely to be accepted.