Abridged Version History

Missing History

From 2017.06.28 onward, this Fossil repository has the full checkin-by-checkin history of the MySQL++ project, plus material that never was available in the old Subversion repository, such as wiki pages like this one.

Prior to that date, the repository contains every released version of MySQL++ from the project's founding in 1998 up through the 3.2.3 release made at the end of 2016. Throughout the rest of this article, I make a deliberate distinction between checkin-level and release-level history.

There is one exception: the version 1.7.33 tarball I have archived is damaged, so I elected not to check that partial release into this Fossil repository, causing the revision history to skip from version 1.7.32 to 1.7.34.

That release-level history prior to 2017 was made by unpacking each release tarball in turn, removing all generated files, then checking in the differences relative to the prior tarball. I have archived the Bash script used to do this conversion along with its associated "ignore" rules for those who want to see how this was done.
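The procedure described above can be sketched roughly as follows. This is not the archived Bash script. It is a minimal Python illustration in which each release is modeled as a mapping of path to file contents rather than a real unpacked tarball. The names `IGNORE_GLOBS`, `strip_generated`, and `diff_releases`, along with the sample glob patterns and file names, are hypothetical.

```python
from fnmatch import fnmatch

# Hypothetical ignore rules; the real list is archived alongside the script.
IGNORE_GLOBS = ["*.o", "Makefile.in", "configure", "doc/html/*"]


def strip_generated(tree, ignore_globs=IGNORE_GLOBS):
    """Drop every path that matches one of the ignore rules."""
    return {
        path: data
        for path, data in tree.items()
        if not any(fnmatch(path, glob) for glob in ignore_globs)
    }


def diff_releases(prev_tree, next_tree):
    """Return the (added, changed, removed) paths between two cleaned releases."""
    added = sorted(set(next_tree) - set(prev_tree))
    removed = sorted(set(prev_tree) - set(next_tree))
    changed = sorted(
        path
        for path in set(prev_tree) & set(next_tree)
        if prev_tree[path] != next_tree[path]
    )
    return added, changed, removed


# Two toy "releases" standing in for unpacked tarballs.
release_a = {"lib/query.cpp": "v1", "configure": "generated", "README": "hello"}
release_b = {"lib/query.cpp": "v2", "configure": "regenerated", "lib/row.cpp": "new"}

added, changed, removed = diff_releases(
    strip_generated(release_a), strip_generated(release_b)
)
print("added:", added, "changed:", changed, "removed:", removed)
```

Because the generated `configure` file is stripped from both trees before the comparison, its churn between releases never reaches the checkin.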

This means we are also missing all of the checkin comments from the old Subversion repository. Fortunately, we still have the file, which was distilled from that comment stream. It also contains attributions for features not made by the project's current maintainer.

If you only want a summary of who did what in MySQL++, see the CREDITS.txt file.

What Happened?

I went through this process out of necessity: our prior version control host ceased operations without notifying me via any channel I pay attention to. Since we were using Subversion prior to Fossil, that left me with only the release tarballs and a few version checkouts from which to reconstruct the project history.

The Value of History

If I'd chosen to migrate to Fossil before losing access to the Subversion repository and made a checkin-by-checkin conversion instead of this present release-level conversion, the Fossil repository would simply be larger, not appreciably more useful.

MySQL++ long since became stable enough that old versions are of very little practical interest. Current versions build and work on pretty much every major OS released in the past 10 years or more, and older OSes often have a contemporary MySQL++ package built for them. There simply is very little call for digging into past versions via this version control system.

Because Fossil is a distributed version control system, every clone pulls all checked-in versions, yet very few people would actually dig back into them.

Having confronted that reality, the question then came down to how much history to try and preserve.

I considered checking in only a handful of epochal versions and leaving it at that. Say, 1.7.9, 1.7.40, plus a selection of 2.x and 3.x releases showing off major stages of the library's development, capped by the current release at the time of conversion, 3.2.3. I rejected this because in all my time watching the project's mailing list, I did not often see people going back to one of these important historical versions.

I therefore decided that what was most important was to scatter the project's complete release history far and wide, so as to avoid a calamity like this again. I wanted everyone who cloned this Fossil repository to have the ability to replace the current project home page if I disappeared like Gna did. That's the beauty of DVCSes: we have plenty of other examples of source code repository hosts disappearing, necrotizing, or becoming evil.

So Why Is It Smaller?

As of this writing, the cloned repository is about the same size as the latest release tarball: about 4 MiB. How can this be if the repository contains every released version? There are a couple of major sources of savings.

File Removal

The primary reason for the size savings is that the conversion script removed all the files that can be generated from another file also checked in. For the most part, these generated files are included in release tarballs only as a convenience to the end user of the MySQL++ library, since generating them is a rather involved process, requiring many tools that aren't always easily available.

In rough order of size, the sources of bloat in the release tarballs relative to this Fossil repository are:

See the ignore rule list if you want to know what all is purposely missing from the Fossil repository as compared to the release tarballs.

There are two processes that generate most of the above: the bootstrap process and make doc. End users of the library are not expected to be able to do either easily, whereas we do burden users of the Fossil repository with that.


There is a second reason for the svelte Fossil clone size: Fossil employs a 2-level compression scheme that ruthlessly squishes out redundancy between checkins.

The first level is delta compression, which stores only the differences between the two versions of each file that exists in both checkins of a sibling pair. If a file only had a few characters added to it between releases, only those few characters and a bit of metadata are stored in the repository.
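To make the idea concrete, here is a minimal sketch of delta encoding in Python. It does not reproduce Fossil's actual delta format; it uses the standard library's `difflib.SequenceMatcher` as a stand-in, and the function names `make_delta` and `apply_delta` and the sample text are hypothetical.

```python
from difflib import SequenceMatcher


def make_delta(old, new):
    """Encode `new` as copy-from-old ranges plus literal inserted text."""
    matcher = SequenceMatcher(None, old, new, autojunk=False)
    delta = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            delta.append(("copy", i1, i2))        # reuse old[i1:i2]
        elif tag in ("replace", "insert"):
            delta.append(("insert", new[j1:j2]))  # store only the new text
        # "delete" needs no entry: that old range is simply never copied
    return delta


def apply_delta(old, delta):
    """Rebuild the new text from the old text plus the delta."""
    parts = []
    for op in delta:
        if op[0] == "copy":
            parts.append(old[op[1]:op[2]])
        else:
            parts.append(op[1])
    return "".join(parts)


# Toy file contents: the second version only appends one line.
old_text = "int main() { return 0; }\n" * 50
new_text = old_text + "// added in the next release\n"

delta = make_delta(old_text, new_text)
literal_chars = sum(len(op[1]) for op in delta if op[0] == "insert")
print(f"new file: {len(new_text)} chars, literal text stored: {literal_chars} chars")
```

The delta holds a reference to the unchanged range of the old file plus the appended line, so the amount of literal text stored is a small fraction of the new file's size.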

On top of that, Fossil applies zlib compression, the same deflate algorithm that gzip uses, to each stored artifact. With a primarily text-based repository like this one, that can result in huge savings.
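The effect on text is easy to see with Python's standard `zlib` module, which implements the same deflate algorithm that gzip uses. This is only an illustration: the sample artifact below is made up, and its heavy repetition overstates the ratio you would get from real source files.

```python
import zlib

# Toy stand-in for a text artifact such as a source file (hypothetical content).
artifact = ("#include <mysql++.h>\nint main() { return 0; }\n" * 200).encode("utf-8")

compressed = zlib.compress(artifact, 9)   # 9 = maximum compression level
restored = zlib.decompress(compressed)    # lossless: we get the input back

ratio = len(artifact) / len(compressed)
print(f"{len(artifact)} bytes -> {len(compressed)} bytes (about {ratio:.0f}:1)")
```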

As of this writing, this Fossil repository enjoys an overall 11:1 compression ratio as compared to checking out each version on disk separately.