Plotting sparklines with XSLT

Using XSLT to generate sparklines, Wok style


As promised in my previous article about using XSLT for plotting, and as anticipated in the related edit, I've been working on generating sparklines with XSLT. And by sparklines this time I mean actual sparklines, not the Unicode mockups I've already introduced to this site: I mean high-resolution (vector!) plots like this representation of (as usual) the language distribution over the years.

In contrast to my previous XSLT plotting efforts, this time I want the results to be flexible enough to create sparklines about anything: this means in particular no hard-coding of the “plotting keys”, and providing ways for the user to customize the plot (at least to some degree). This does result in a larger XSLT stylesheet, which would be counter-productive if used to plot a single sparkline (in terms of economy of space and bandwidth), but the cost quickly amortizes over multiple sparklines, for example when showing the sparkline for all languages as well as the ones for Italian and Latin individually, or the combined Italian/English sparkline. I'm showing all of these here only to showcase some of the functionality of this stylesheet. One feature is the possibility to select how many and which lines to plot, with automatic scaling based on the maximum of the given series, as clearly noticeable in the Latin sparkline, whose peak is essentially invisible in the “everything” sparkline. The other is the possibility to select the colors: if your UA is set to prefer dark mode, you'll notice that the lines in the combined Italian/English sparkline are lighter than the corresponding ones in the “everything” sparkline.
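To give an idea of what the automatic scaling involves in XSLT 1.0 (which has no max() function), here is a minimal sketch, not the actual stylesheet. The input format (a plot element containing series elements, each with point elements carrying a value attribute) and the width and height parameters are assumptions made up for this example. The maximum is taken over all the series being plotted, which is why a small series shrinks when drawn next to a large one.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns="http://www.w3.org/2000/svg">

  <!-- Hypothetical plot size, in SVG user units. -->
  <xsl:param name="width" select="100"/>
  <xsl:param name="height" select="20"/>

  <xsl:template match="/plot">
    <!-- XSLT 1.0 has no max(): sort all plotted values in descending
         order and keep only the first one. -->
    <xsl:variable name="max">
      <xsl:for-each select="series/point/@value">
        <xsl:sort select="." data-type="number" order="descending"/>
        <xsl:if test="position() = 1"><xsl:value-of select="."/></xsl:if>
      </xsl:for-each>
    </xsl:variable>
    <svg viewBox="0 0 {$width} {$height}">
      <xsl:apply-templates select="series">
        <xsl:with-param name="max" select="$max"/>
      </xsl:apply-templates>
    </svg>
  </xsl:template>

  <xsl:template match="series">
    <xsl:param name="max"/>
    <xsl:variable name="count" select="count(point)"/>
    <!-- Skip degenerate series to avoid dividing by zero. -->
    <xsl:if test="$max &gt; 0 and $count &gt; 1">
      <polyline fill="none" stroke="currentColor">
        <xsl:attribute name="points">
          <xsl:for-each select="point">
            <!-- x spreads the points evenly over the width. -->
            <xsl:value-of select="(position() - 1) * $width div ($count - 1)"/>
            <xsl:text>,</xsl:text>
            <!-- y is flipped because the SVG y axis grows downward. -->
            <xsl:value-of select="$height - @value * $height div $max"/>
            <xsl:text> </xsl:text>
          </xsl:for-each>
        </xsl:attribute>
      </polyline>
    </xsl:if>
  </xsl:template>
</xsl:stylesheet>
```

With a single series holding the values 0, 2 and 1 and the default size, the points attribute comes out as 0,20 50,0 100,10. Selecting which series to plot would then be a matter of narrowing the two select expressions that pick the series.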

To get an idea about the convenience (or not) of this approach, let's have a look at some numbers. At the time of writing, the XSLT stylesheet is already above 21KiB. The most complex sparkline is barely more than 10KiB. The other sparklines are even smaller, since the size clearly depends on the number of lines in the plot (and some choices such as whether or not to draw points at null values). However, all the sparklines presented here together add up to around 35KiB (or at best around 20% less when omitting null value points). Even considering that having all of them derived from the same dataset is an exception rather than the rule, we can see how quickly the (byte) cost of the XSLT gets amortized (and that's before putting any effort into minimizing the XSLT size).

The next step will be to introduce proper sparklines to replace the pseudo-sparklines based on Unicode blocks currently shown under each index page (here's for example the root index, and the one for the tech column) and later enhanced with some metadata popup.

Moving from the textual pseudo-sparklines to true sparklines will be a big change.

For example, one thing pseudo-sparklines can do, but true sparklines won't, is to wrap around when they are too long: if you visit any of my index pages from a mobile phone, for example, you'll probably see the pseudo-sparkline take something between 5 and 10 lines. This was never intentional, but it's an interesting side-effect of the textual description of the sparkline. In graphic form, the sparkline will (at most) fill the whole line, and grow or shrink based on the available screen real estate. I am not entirely convinced this will be a superior choice, but for sure it'll be truer to the spirit of the object. It will also mean that I won't have to worry about screen real estate to show both the commits and dates sparklines, even in the same plot, and I will be able to add the per-column sparklines at the root of the Wok.

It will also be interesting to see how much space usage will change. Currently, the auto-generated pseudo-sparklines take around 30KiB each, and all 11 of them together (10 columns + 1 root index page) add up to upwards of 340KiB extra, integrated directly into the index pages because static HTML pages do not have a way to include external HTML fragments (although this is actually possible using an XML dialect of HTML and an appropriate XSLT stylesheet).
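For the curious, the XSLT route to including external fragments can be sketched as follows. This is an illustration under stated assumptions rather than something in use on this site: the inc:fragment element and its urn:example:include namespace are invented for the example, and the page is assumed to be XHTML served as XML with an xml-stylesheet processing instruction pointing at this stylesheet. The only standard piece doing the real work is the XSLT 1.0 document() function.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:inc="urn:example:include">

  <!-- Identity transform: by default, copy the page unchanged. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Do not carry the stylesheet reference over into the result. -->
  <xsl:template match="processing-instruction('xml-stylesheet')"/>

  <!-- Replace each (hypothetical) inc:fragment element with the root
       element of the XML file named by its href attribute. The URI is
       resolved relative to the document containing the attribute. -->
  <xsl:template match="inc:fragment">
    <xsl:copy-of select="document(@href)/*"/>
  </xsl:template>
</xsl:stylesheet>
```

A page would then contain something like an inc:fragment element with href="sparkline.xml" where the fragment should appear. The fragment has to be well-formed XML, and browsers generally only load it from the same origin as the page.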

Moreover, the actual content of each index page changes whenever a sparkline changes, and since the “sparkline update” runs unconditionally, this means that all index pages (even those that would be unaffected) are regenerated each time I publish anything anywhere.

So, in the current setup, index pages are 30KiB larger than they need to be, and are regenerated more often than they need to be. True sparklines, on the other hand, would simply be included via an unchanged object embedded in the page, and the only thing that would change is the XML data loaded by the SVG skeleton that will produce the sparkline via the XSLT stylesheet.

All in all, I expect the change to provide more visual information (since it will be possible to visualize both the commits and dates timelines) with less data both on disk and on the wire.

The reason I don't actually know yet is that moving “up” from the languages-per-year plots I've been working on lately is not as trivial as I would like it to be. The biggest challenge will be the switch from yearly to monthly data.

Since browser development is in the hands of people that apparently despise XML, all of them are stuck on 1999 tech even though XSLT has had some extremely significant improvements in the following 20+ years. Among the things that I would have available if I could use more modern XSLT, but cannot because browser development is controlled by user-hostile companies, are date/time manipulation functions.

So I'll have to roll my own, which will be time-consuming (although I will limit myself to what I actually need) and lead to an unnecessary growth in size for the stylesheet. (This, for anyone who's counting, isn't a downside of using XSLT, but a downside of browser developers refusing to move forward with more modern versions of it like they have instead done with all other web tech.) And I still expect the combined XSLT plus XML data to weigh less than the rendered SVG sparklines. After all, we're talking about over 300 data points for the smallest index sparkline already, and over 400 for the longer ones (with guaranteed growth), which is an order of magnitude more than the data points in the simple sparklines I'm showing here.
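As a taste of what rolling your own looks like, here is a minimal sketch of the one operation a monthly plot cannot do without: turning a date into a position on the horizontal axis. It relies only on the XPath 1.0 substring() and number() functions. The input format (a commits element whose commit children carry an ISO 8601 date attribute, earliest first) and the month-index template name are assumptions for the example, not the data format used here.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Number of months since year 0 for a date string that starts
       with YYYY-MM (anything after the month is ignored). -->
  <xsl:template name="month-index">
    <xsl:param name="date"/>
    <xsl:value-of select="number(substring($date, 1, 4)) * 12
                          + number(substring($date, 6, 2)) - 1"/>
  </xsl:template>

  <xsl:template match="/commits">
    <!-- Month index of the first entry, assumed to be the earliest. -->
    <xsl:variable name="first">
      <xsl:call-template name="month-index">
        <xsl:with-param name="date" select="commit[1]/@date"/>
      </xsl:call-template>
    </xsl:variable>
    <xsl:for-each select="commit">
      <xsl:variable name="this">
        <xsl:call-template name="month-index">
          <xsl:with-param name="date" select="@date"/>
        </xsl:call-template>
      </xsl:variable>
      <!-- Offset in months from the first entry, one per line. -->
      <xsl:value-of select="$this - $first"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

Given two commit elements dated 2023-11-05 and 2024-02-01, this prints 0 and 3. Counting entries per month, or filling in the months with no entries, builds on the same index, and that is where the extra stylesheet size would come from.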

Despite the uncertainty about what's to come, I felt it was important to push this update: seeing those first sparklines pop out of the page has been one of the most satisfying moments in my recent life. Truly a Frankenstein (Frankensteen) moment.