From 2017.06.28 onward, this Fossil repository has the full checkin-by-checkin history of the MySQL++ project, plus material that never was available in the old Subversion repository, such as wiki pages like this one.
Prior to that date, the repository contains every released version of MySQL++ from the project's founding in 1998 with version 0.64.1.1.a up until the 3.2.3 release made at the end of 2016. Through the rest of this article, I will be making a conscious distinction between checkin-level vs release-level history.
There is one exception: the version 1.7.33 tarball I have archived is damaged, so I elected not to check that partial release into this Fossil repository, causing the revision history to skip from version 1.7.32 to 1.7.34.
That release-level history prior to 2017 was made by unpacking each release tarball in turn, removing all generated files, then checking in the differences relative to the prior tarball. I have archived the Bash script used to do this conversion along with its associated "ignore" rules for those who want to see how this was done.
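To make the "differences relative to the prior tarball" idea concrete, here is a minimal Python sketch. It is not the archived Bash script: the function names and the in-memory `{path: contents}` representation of an unpacked tarball are assumptions made purely for illustration. In a real conversion the version control tool would compute these differences itself once the working tree had been replaced with the next release's files.

```python
from typing import Dict, List, Tuple

Tree = Dict[str, bytes]  # hypothetical stand-in for an unpacked, cleaned tarball


def diff_trees(prev: Tree, curr: Tree) -> Dict[str, List[str]]:
    """Classify paths as added, removed, or changed between two releases."""
    return {
        "added": sorted(p for p in curr if p not in prev),
        "removed": sorted(p for p in prev if p not in curr),
        "changed": sorted(p for p in curr if p in prev and curr[p] != prev[p]),
    }


def plan_checkins(releases: List[Tuple[str, Tree]]) -> List[Tuple[str, Dict[str, List[str]]]]:
    """Walk the releases in order, diffing each against its predecessor.

    The first release is diffed against an empty tree, so every file in it
    shows up as "added".
    """
    plan = []
    prev: Tree = {}
    for version, tree in releases:
        plan.append((version, diff_trees(prev, tree)))
        prev = tree
    return plan
```

Each entry in the resulting plan corresponds to one release-level checkin.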
This means we are also missing all of the checkin comments from the old Subversion repository. Fortunately, we still have the ChangeLog.md file, which was distilled from that comment stream. It also contains attributions for features not made by the project's current maintainer.
If you only want a summary of who did what in MySQL++, see the
I went through this process out of necessity: our prior version control host ceased operations without notifying me via any channel I pay attention to. Since we were using Subversion prior to Fossil, that left me with only the release tarballs and a few version checkouts from which to reconstruct the project history.
The Value of History
If I'd chosen to migrate to Fossil before losing access to the Subversion repository and made a checkin-by-checkin conversion instead of this present release-level conversion, the Fossil repository would simply be larger, not appreciably more useful.
MySQL++ long since became stable enough that old versions are of very little practical interest. Current versions build and work on pretty much every major OS released in the past 10 years or more, and older OSes often have a contemporary MySQL++ package built for them. There simply is very little call for digging into past versions via this version control system.
Since Fossil is a distributed version control system, every clone pulls all checked-in versions, but very few people would actually dig back into them.
Having confronted that reality, the question then came down to how much history to try to preserve.
I considered checking in only a handful of epochal versions and leaving it at that: say, 1.7.9, 1.7.40, plus a selection of 2.x and 3.x releases showing off major stages of the library's development, capped by the current release at the time of conversion, 3.2.3. I rejected this because in all my time watching the project's mailing list, I did not often see people going back to one of these important historical versions.
I therefore decided that what was most important was to scatter the project's complete release history far and wide, so as to avoid a calamity like this again. I wanted everyone who cloned this Fossil repository to have the ability to replace the current project home page if I disappeared like Gna did. That's the beauty of DVCSes: we have plenty of other examples of source code repository hosts disappearing, necrotizing, or becoming evil.
So Why Is It Smaller?
As of this writing, the cloned repository is about the same size as the latest release tarball: about 4 MiB. How can this be if the repository contains every released version? There are a couple of major sources of savings.
The primary reason for the size savings is that the conversion script removed all the files that can be generated from another file also checked in. For the most part, these generated files are included in release tarballs only as a convenience to the end user of the MySQL++ library, since generating them is a rather involved process, requiring many tools that aren't always easily available.
In rough order of size, the sources of bloat in the release tarballs relative to this Fossil repository are:
Documentation: The user and reference manuals are currently generated from the DocBook and Doxygen source material, respectively. The Fossil repository only has that source material checked in, not the HTML, PDF, PostScript, DVI, PNG, GIF, and other files generated from these sources.
Build System Files: The Fossil repository does not contain any of the Makefiles or IDE project files generated from mysql++.bkl by Bakefile. Past release checkins similarly are missing the outputs from the GNU Autotools called by the contemporaneous versions of the
Generated Headers: There are a couple of Perl scripts in MySQL++ that generate header files that would be tedious to maintain by hand. This repository includes only the Perl scripts, not the headers they generate. The release tarballs contain the headers as well, so the end user of the library doesn't have to have Perl. (This is a particular bother on Windows, which is one reason we recommend bootstrapping the library on a POSIX type system.)
.in File Outputs: The build system generates several files included in release tarballs from a file with the same name plus a *
Junk: The release tarballs sometimes included files that never should have been included, such as editor temporaries, old backup files, and files trivially generated from another that we were already going to include.
See the ignore rule list if you want to know what all is purposely missing from the Fossil repository as compared to the release tarballs.
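As an illustration of how glob-style ignore rules can strip generated files from an unpacked release, here is a small Python sketch. The patterns below are hypothetical examples inferred from the categories listed above (generated documentation, Makefiles, editor temporaries, backups); they are not the project's actual ignore rule list, and the function name is likewise an assumption.

```python
from fnmatch import fnmatchcase
from typing import Iterable, List

# Hypothetical patterns for illustration only, not the real ignore rules.
IGNORE_GLOBS = ["*.html", "*.pdf", "Makefile", "*.bak", "*~"]


def keep_sources(paths: Iterable[str], ignore_globs: List[str] = IGNORE_GLOBS) -> List[str]:
    """Return only the paths that match none of the ignore patterns.

    Each pattern is tried against both the full path and the bare file name,
    so a rule like "Makefile" also catches "lib/Makefile".
    """
    kept = []
    for path in paths:
        name = path.rsplit("/", 1)[-1]
        if any(fnmatchcase(path, g) or fnmatchcase(name, g) for g in ignore_globs):
            continue
        kept.append(path)
    return kept
```

Given a release's file list, only hand-maintained sources such as `lib/query.cpp` or `mysql++.bkl` would survive the filter and be checked in.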
There are two processes that generate most of the above: the bootstrap process and make doc. End users of the library are not expected to be able to do either easily, whereas we do burden users of the Fossil repository with that.
There is a second reason for the svelte Fossil clone size: Fossil employs a 2-level compression scheme that ruthlessly squishes out redundancy between checkins.
The first level is delta compression, which stores only the differences between the two versions of a file in each pair of adjacent checkins. If a file only had a few characters added to it between releases, only those few characters and a bit of metadata are stored in the repository.
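The following Python sketch shows the general idea of delta storage. It is a deliberately naive stand-in, not Fossil's actual delta format: it records only how many leading and trailing bytes the new version shares with the old one, plus the bytes in between. The function names and the 8-byte header layout are assumptions for illustration.

```python
def make_delta(old: bytes, new: bytes) -> bytes:
    """Encode `new` relative to `old` as prefix length, suffix length, middle bytes."""
    limit = min(len(old), len(new))

    # Count the bytes shared at the start of both versions.
    prefix = 0
    while prefix < limit and old[prefix] == new[prefix]:
        prefix += 1

    # Count the bytes shared at the end, without overlapping the prefix.
    suffix = 0
    while (suffix < limit - prefix
           and old[len(old) - 1 - suffix] == new[len(new) - 1 - suffix]):
        suffix += 1

    middle = new[prefix:len(new) - suffix]
    return prefix.to_bytes(4, "big") + suffix.to_bytes(4, "big") + middle


def apply_delta(old: bytes, delta: bytes) -> bytes:
    """Rebuild the new version from the old version plus a delta."""
    prefix = int.from_bytes(delta[:4], "big")
    suffix = int.from_bytes(delta[4:8], "big")
    middle = delta[8:]
    return old[:prefix] + middle + old[len(old) - suffix:]
```

With this scheme, appending a short line to a large file produces a delta of only the new bytes plus the fixed 8-byte header, regardless of how large the file is.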
On top of that, Fossil uses zlib compression for each stored artifact. With a primarily text-based repository like this one, that can result in huge savings.
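To see why per-artifact compression pays off on text, here is a short Python sketch using the standard `zlib` module, which implements the same DEFLATE algorithm that gzip uses. The helper names are hypothetical, and this says nothing about how Fossil lays out artifacts internally; it only demonstrates the compress and restore round trip on repetitive source-like text.

```python
import zlib


def compress_artifact(artifact: bytes) -> bytes:
    """Compress one artifact's bytes with DEFLATE via zlib."""
    return zlib.compress(artifact)


def decompress_artifact(blob: bytes) -> bytes:
    """Restore the original bytes from a compressed blob."""
    return zlib.decompress(blob)


def compression_ratio(artifact: bytes) -> float:
    """Original size divided by compressed size; higher means more savings."""
    return len(artifact) / len(compress_artifact(artifact))
```

Source code and documentation markup tend to repeat the same keywords, identifiers, and indentation, which is exactly the kind of redundancy DEFLATE removes.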
As of this writing, this Fossil repository enjoys an overall 11:1 compression ratio as compared to checking out each version on disk separately.