What’s better than semver?

Many software libraries are released with version “numbers” that follow a scheme called Semantic Versioning. A semantic version is three numbers separated by dots, of the form x.y.z, where:

  • if x is zero, all bets are off. Otherwise:
  • z increments “if only backwards compatible bug fixes are introduced. A bug fix is defined as an internal change that fixes incorrect behavior.”

Problem one: there is no such thing as an “internal change that fixes incorrect behavior” that is “backwards compatible”. If a library has a function f() in its public API, I could be relying on any observable behaviour of f() (potentially but pathologically including its running time or memory use, but here I’ll only consider return values or environment changes for given inputs).

If they “fix” “incorrect” behaviour, the library maintainers may have broken the package for me. I would need a comprehensive collection of contract or integration tests to know that I can still use version x.y.z' if version x.y.z was working for me. This is the worst situation, because the API looks like it hasn’t changed: all of the places where I call functions or create objects still do something, they just might not do the right thing any more.
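
To make problem one concrete, here is a minimal sketch in Python. Every name in it (parse_port_v1_0_0, parse_port_v1_0_1, connect_target, contract_test) is hypothetical and invented for illustration; the two parse functions stand in for one public f() before and after a patch release.

```python
from typing import Callable, Optional


def parse_port_v1_0_0(text: str) -> Optional[int]:
    """Hypothetical f() as shipped in x.y.z: bad input quietly yields None."""
    try:
        return int(text)
    except ValueError:
        return None


def parse_port_v1_0_1(text: str) -> int:
    """Hypothetical f() after the "bug fix" in x.y.z': bad input now raises."""
    return int(text)


def connect_target(parse: Callable[[str], Optional[int]], text: str) -> int:
    """Client code that relies on the old behaviour to fall back to a default."""
    port = parse(text)
    return 8080 if port is None else port


def contract_test(parse: Callable[[str], Optional[int]]) -> bool:
    """Client-side contract test: does this version still do what I rely on?"""
    try:
        return connect_target(parse, "not-a-port") == 8080
    except ValueError:
        return False
```

Both versions have the same signature from the caller's point of view, so nothing at the call site hints at the change. Only something like contract_test, run against each new release, would reveal that the patch release no longer suits this client.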

Problem two: since I have set aside any dependency on running time or memory use, a refactoring could represent a genuinely non-breaking change. Semver has nowhere to record truly backwards compatible changes, because bug fixes are erroneously considered backwards compatible.

  • y increments “if new, backwards compatible functionality is introduced to the public API”.

This is fine. I get new stuff that I’m not (currently) using, but you haven’t broken anything I do use.

Problem three: an increment to y “MAY include patch level changes”. So I can’t just quietly take in the new functionality and decide whether I need it on my own time, because the library maintainers have rolled in all of their supposedly-backwards-compatible-but-not-really changes, so I still don’t know whether this version works for me.

  • x increments “if any backwards incompatible changes are introduced to the public API”.

Problem four: I’m not looking at the same library any more. It has the same name, but it could be completely rewritten, have any number of internal behaviour changes, and any number of external interface changes. It might not do what I want any more, or might do it in a way that doesn’t suit the needs of my application.
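
The three semver rules quoted above can be sketched as a small function. This is an illustration under assumptions, not a reference implementation: the Change enum and semver_bump are hypothetical names, and the resetting of lower components to zero follows the semver specification rather than anything quoted in this post.

```python
from enum import Enum
from typing import Iterable, Tuple

Version = Tuple[int, int, int]


class Change(Enum):
    BUG_FIX = "bug fix"
    NEW_FEATURE = "new feature"
    BREAKING = "breaking"


def semver_bump(version: Version, changes: Iterable[Change]) -> Version:
    """Return the next x.y.z for a release containing the given kinds of change."""
    x, y, z = version
    kinds = set(changes)
    if not kinds:
        return version
    if Change.BREAKING in kinds:
        # x increments; y and z reset (per the semver spec).
        return (x + 1, 0, 0)
    if Change.NEW_FEATURE in kinds:
        # y increments and MAY silently include bug fixes; z resets.
        return (x, y + 1, 0)
    return (x, y, z + 1)
```

With this sketch, a release that only adds a feature and a release that adds a feature and also "fixes" behaviour both take 1.4.2 to 1.5.0. The version number alone cannot tell the two apart, which is problem three.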

On the plus side

The dots are fine. I’m happy with the dots. Please do not feel the need to leave a comment if you are unhappy with the dots or can come up with some contrived reason why “dots are harmful”, as I don’t care.

Better: meaningful versioning

I would prefer to use a version scheme that looks like z.w.y:

  • y has the meaning it does in semver, except that it MUST NOT include patch level changes. If a package maintainer has added new things or deprecated (but not removed) old things, then I can use the package still.
  • z has the meaning it does in semver, except that we stop pretending that bug fixes can be backwards compatible.
  • w is incremented if non-behavioural changes are implemented; for example if internals are refactored, caches are introduced or removed, or private data structures are changed. These are changes that probably mean I can use the package still, but if I needed particular performance attributes from the library then it is on me to discover whether the new version still meets my needs.

There is no room for x in this scheme. If a maintainer wants to write a new, incompatible library, they can use a new name.
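
One way to read the z.w.y rules is as the following sketch. The names Change and meaningful_bump are hypothetical, and two details are my assumptions rather than part of the scheme as described: a release that mixes additions with other kinds of change is rejected outright, and lower components are left untouched rather than reset.

```python
from enum import Enum
from typing import Iterable, Tuple

Version = Tuple[int, int, int]  # (z, w, y)


class Change(Enum):
    ADDITIVE = "new or deprecated-but-present API"
    BEHAVIOURAL = "bug fix or other observable behaviour change"
    NON_BEHAVIOURAL = "refactoring, caching, private data structures"
    INCOMPATIBLE = "removed or redefined public API"


def meaningful_bump(version: Version, changes: Iterable[Change]) -> Version:
    """Return the next z.w.y for a release containing the given kinds of change."""
    z, w, y = version
    kinds = set(changes)
    if not kinds:
        return version
    if Change.INCOMPATIBLE in kinds:
        # There is no x: an incompatible library gets a new name instead.
        raise ValueError("incompatible changes need a new library name")
    if Change.ADDITIVE in kinds:
        if kinds != {Change.ADDITIVE}:
            # Assumption: enforce "y MUST NOT include patch level changes".
            raise ValueError("a y release must be purely additive")
        return (z, w, y + 1)
    if Change.BEHAVIOURAL in kinds:
        return (z + 1, w, y)
    return (z, w + 1, y)
```

Under this reading, going from 3.1.4 to 3.1.5 guarantees that nothing I already use has changed, 3.2.4 tells me to re-check performance if I care about it, and 4.1.4 tells me to re-run my contract tests.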

Different: don’t use versions

This is more work for me, but less work for the package maintainer. If they are maintaining a change log (which they are, as they are using version control) and perhaps a medium for announcing important changes including security and bug fixes and new features, then I can pick the commit that I discover does what I need. I can maintain my own tree (and should be anyway, in case the maintainer decides to delete their upstream repo) and can cherry-pick the changes that are useful for me, leaving out the ones that are harmful for me.

This is more work for me than the z.w.y scheme because now I have to understand the impact of each change. It is the same amount of work as the semver x.y.z scheme, because then I had to understand the impact of each change too, as changes to any of the three version components could potentially include supposedly-backwards-compatible-but-not-really changes.

About Graham

I make it faster and easier for you to create high-quality code.

4 Responses to What’s better than semver?

  1. Nick Lockwood says:

    This is interesting, but AFAICT z.w.y provides no mechanism for ever removing deprecated functionality, which to my mind is the main purpose of x releases (not creating an essentially new library with the same name).

    Another possible problem is that this approach doesn’t easily allow for error correction. In semver if you accidentally ship a breaking change in a point release, you can ship another point release that unbreaks it, and be reasonably sure that anyone who would have updated to the bad release will also update to the fixed version.

    But in the z.w.y scheme, if you accidentally broke something in a w or y release, you’d have to ship a z release to fix it (unless you compound the error by deliberately making another incorrectly labeled breaking change). But a user with a conservative update policy might then be stuck with the broken version forever.

  2. Leo Zhang says:

    What are examples of cases when incrementing z would give useful information? It seems like all useful types of changes are not backwards-compatible by your definition: security updates, performance improvements, bug fixes, etc. As the consumer of a package, I’m not really interested in updates that only refactor code without changing anything because I don’t maintain the package code.

  3. Graham says:

    The intended meaning of z is “we’re doing the same thing as before, in a way that we think is better, we think you might want it but we might have broken something for you”. You could imagine that in a library with a well-documented contract, going from “this method returns a Foo” to “this method returns this specific subclass of Foo” is a behavioural change that probably nobody is worried about. Going from “we returned an unconfigured Foo in this edge case” to “we throw an InsufficientInformationException in this edge case” is a behavioural change that probably somebody should be worried about. Many libraries do not have well-documented contracts so in reality a z change just means “whatever we’ve changed, your client code still compiles/still doesn’t TypeError”.

    The intended meaning of x is “we’ve removed a thing you needed, changed how it works, or changed what it does” which is a broader church than z changes. So because the z changes are a narrower collection of changes (that I still need to worry about), they’re easier to accept than x changes.

  4. Graham says:
    • Removing deprecated functionality: one has to question why a library would scope creep so much that the initial use goes out of scope. “We’ve found a better way to do this” seems like a great example of an x change: leave the existing thing alone (let it wither, stop adding y releases, whatever) and recommend the new, better thing.

    A pet gripe of mine is that whenever the Cocoa folks think of a new way to do something or a new thing to do, they add it to existing classes so that the APIs for the existing things become bloated and eventually they deprecate and remove existing, working methods rather than documenting “this is the better way to do it, we recommend this new thing from now on, and no longer document doing it with that thing”. View-based table views are a different thing from cell-based table views, but are both called NSTableView. IMO the new thing should have a different name.

    • Your second problem is inconsistent. You allow that an accidental breakage in semver can be fixed by a deliberate breakage in semver, then say that you cannot fix an accidental breakage in z.w.y except by a deliberate breakage in z.w.y or an increment of z. It seems like if you can accept that in one, you should be able to accept that in the other. However, I would argue that it’s difficult to accidentally ship a breaking y change, because it should be clear whether a collection of commits is purely additive or not.
