Ordering options by frequency with hysteresis

(Skip to the horizontal divider for the idea if you don’t care about the context.)

With the continued Twitter chaos (popcorn worthy, in a way), the ActivityPub/Mastodon ecosystem has received a ton of additional attention lately. This includes UI/UX work on both the official Mastodon web UI and multiple alternative clients on several platforms. I am currently using Tusky on Android, Ivory on iOS, and both the Mastodon “Advanced” UI and [Elk](https://elk.zone/) on the web.

One of the features Mastodon exposes prominently is tagging posts with metadata such as the posting language. Unfortunately, the multilingual use case is rarely well served by the way these features are currently implemented: most clients let the user set exactly one default language for posting, and offer a long, usually alphabetically sorted, list to pick another option. At least there is usually type-ahead search, but it is still far from a great user experience.

Mastodon itself lets users choose several “content languages”, as in “show me content in these languages”. This setting could conceivably double as a list of languages to pin to the top of the language selector widget, but having a separate setting would be better for flexibility. It also stands to reason that most people will be able to read and understand more languages than they are fluent writing in.

Already, there is a bit of a heuristic in Mastodon mainline, namely that it will set the post language of replies to the language of the parent post, which makes sense for threads. Additionally, English is always pinned near the top of the list, right after the user’s chosen default posting language (which makes some sense, given the widespread use of English as an international communication language, but then we should also ask why not Mandarin, or Spanish, etc.).

One UI paradigm that is often employed in similar contexts, and one that has also come up in discussions about evolving the language selector feature, is remembering recently used languages. There are more or less sophisticated algorithms here, for example:

  • Strict ordering by recency – the currently selected language will move to the top of the list and push everything else down one slot. Simple, predictable, but a bit annoying if the user frequently posts in different languages.
  • Static ordering from a selection – Of the many languages supported in the standard, the user gets to choose a subset (e.g. with checkboxes) and those get shown first in the widget, in whatever “natural” order they are normally in (e.g. alphabetically, or by database ID) and that order does not change. The advantage is that the user never needs to scan the list visually – click positions remain the same. The disadvantage is, obviously, that the “natural” ordering may not align with the user’s language use frequency at all.
  • Static, user-defined ordering – Same as above, but the user has a way to define the order explicitly. All the advantages as above, with none of the disadvantages, but the UI is necessarily more complex, both for the users to use and the developers to build and maintain, which makes this less accessible.
  • Simple ordering by frequency – usage is counted for each option, and the options are ordered by the frequency of their use, e.g. if I post in English 50% of the time, 45% in German, and occasionally in Dutch, then the list would show English, German, and Dutch, in that order. While this has many advantages, it can take a while for the ordering to settle if the options are not used at clearly different frequencies. There is also a pathological case where people use several options almost equally, and statistical noise leads to frequent re-ordering of the options. Another problem case is usage that shifts over time: at different times of the day (or of one’s life) one interacts more or less with different communities, in different languages (one simple example could be moving to a new country).
  • Order by frecency. Wikipedia says “In computing, frecency is any heuristic that combines the frequency and recency into a single measure.” The most popular example is probably most browsers’ URL input boxes – I know for a fact that Firefox uses a sophisticated frecency algorithm, but I would assume that other browsers do as well. Basically, this algorithm weighs recently used items higher than they would be looking purely at the frequency of their use. A good compromise for most use cases, but for simple items like languages (versus suggested web URLs after a partially typed query), the ordering will likely be unpredictable and opaque for most users, which would be frustrating.
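To make two of these strategies concrete, here is a minimal Python sketch of strict recency ordering and of one possible frecency score. The function names, the `half_life` parameter, and the exponential decay are assumptions chosen for illustration; this is not the algorithm Firefox or any Mastodon client actually uses.

```python
def order_by_recency(order, used):
    """Strict recency: the option just used moves to the top of the list."""
    return [used] + [option for option in order if option != used]


def order_by_frecency(history, half_life=5.0):
    """Order options by a decayed use count (an assumed frecency formula).

    `history` lists the used options from oldest to newest. A use that is
    `half_life` posts old counts half as much as the most recent use.
    """
    scores = {}
    newest = len(history) - 1
    for index, option in enumerate(history):
        age = newest - index
        scores[option] = scores.get(option, 0.0) + 0.5 ** (age / half_life)
    return sorted(scores, key=scores.get, reverse=True)
```

With the history `["de", "de", "de", "en"]`, a short half-life of one post puts `en` first, while a half-life of five posts keeps `de` on top. The result depends on a tuning parameter the user never sees, which is exactly the opacity problem described above.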

Now, what originally prompted this post was an idea I had while proposing changes to the language selector in Elk: Currently the default post language is the UI display language (not a good choice, they should really be decoupled) and there is no way to configure post language separately. Additionally, the language chooser widget does not show the currently selected option without clicking. All in all, not a great user experience (but one that is common in lots of software, especially when it is developed by a primarily English-speaking group). However, I wanted to make sure that they wouldn’t switch to a simple frequency-based model, because I post almost equally in English and German and I hate nothing more than drop-down menus that rearrange themselves seemingly randomly.

After considering the options outlined above, I thought that maybe another sorting method would be useful, which I admittedly haven’t completely thought through yet, but that I’d like to call “frequency (or frecency) with hysteresis”.

The idea is simple: Instead of looking only at frequency or frecency of items, add a hysteresis threshold.

Hysteresis, in this context, is at its core a “lagging change”, an approach commonly used for e.g. temperature control: Instead of stopping heating exactly at reaching 21.0 degrees C and restarting heating immediately when the temperature falls to 20.9 degrees C, there is a certain window the measurement has to clear first, so the heating would only start at e.g. 20.0 C and then heat up to 21.5 C, instead of constantly flip-flopping between on and off around 21.0 C.
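As a small illustration, the thermostat could be sketched in Python like this. The thresholds are the ones from the example above; the function and parameter names are made up for this sketch.

```python
def heater_states(temperatures, turn_on_at=20.0, turn_off_at=21.5, heating=False):
    """Return whether the heater is on after each temperature reading."""
    states = []
    for temperature in temperatures:
        if temperature <= turn_on_at:
            heating = True
        elif temperature >= turn_off_at:
            heating = False
        # Between the two thresholds, the previous state is simply kept.
        states.append(heating)
    return states
```

For the readings 21.0, 20.9, 20.0, 20.9, 21.0, 21.4, 21.5, 21.0 the heater turns on once at 20.0 and off once at 21.5, rather than toggling on every small fluctuation around 21.0.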

In the context of ordering items in a menu, that would mean that the items do not get re-ordered immediately when their frequency or recency ranking changes, but only after it has changed by a certain amount (or a certain number of times).

An example: Of my last 10 posts, 5 were in English, and 5 in German. The languages drop-down shows English, German, and then all other languages, because I used English last. Now I post once in English and once in German, and now the ordering would be German, English, others, if going by frequency with last-used-wins. Annoying, because chances are good that the next post would be in English, which is now option #2.
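This flip can be reproduced with a few lines of Python. It is only a sketch of “frequency with last-used-wins” as described here; the helper name and the alternating post history are assumptions, and languages that were never used are left out.

```python
from collections import Counter


def order_by_frequency(history):
    """Order options by use count; ties go to the more recently used option."""
    counts = Counter(history)
    # Index of the most recent use of each option (later entries overwrite).
    last_used = {option: index for index, option in enumerate(history)}
    return sorted(counts, key=lambda o: (counts[o], last_used[o]), reverse=True)


history = ["de", "en"] * 5  # 5 German and 5 English posts, English used last
before = order_by_frequency(history)
after = order_by_frequency(history + ["en", "de"])
```

Here `before` is English, German, and `after` is German, English: two posts that leave the counts tied are enough to swap the top two entries.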

If there were hysteresis in the ordering, it would not change until I had posted significantly more in one language recently, let’s say at least a third more than in the language currently in the top spot. In the above example, with both languages at 6 posts, German would take the top spot only once I had posted 2 more times in German than in English (6/3 = 2, 6 + 2 = 8). The exact thresholds should be determined based on some actual usage data, but this could nicely work around the challenge of people using multiple items with near-equal frequency, and avoid the flip-flopping of a simple recency-based approach.
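A rough Python sketch of the idea, under some assumptions: the class name and its API are invented for illustration, the one-third margin is the placeholder value from the example, and the counts cover all posts rather than only recent ones (a sliding window or a frecency score could be swapped in for the plain counter).

```python
from collections import Counter
from fractions import Fraction


class HysteresisOrder:
    """Keep a menu order that only changes once a usage margin is cleared."""

    def __init__(self, options, counts=None, margin=Fraction(1, 3)):
        self.order = list(options)
        self.counts = Counter(counts or {})
        self.margin = margin  # assumed threshold: "at least a third more"

    def record(self, used):
        """Count one use of `used` and return the (possibly updated) order."""
        self.counts[used] += 1
        position = self.order.index(used)
        # Move up only while the count exceeds the neighbor above by the margin.
        while position > 0:
            above = self.order[position - 1]
            if self.counts[used] < self.counts[above] * (1 + self.margin):
                break
            self.order[position - 1], self.order[position] = used, above
            position -= 1
        return list(self.order)


menu = HysteresisOrder(["en", "de", "nl"], counts={"en": 5, "de": 5})
orders = [menu.record(language) for language in ["en", "de", "de", "de"]]
```

In this run the order stays English, German, Dutch for the first three posts and only becomes German, English, Dutch on the fourth, when German reaches 8 posts against 6 in English. `Fraction` is used so that the comparison at exactly a third more is not affected by floating-point rounding.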

Oh, how history’s repeating

It’s been five years, which in Twitter terms is, like, three eternities, min!

A little over five years ago, I was fed up. With Twitter in particular, but also with the “Web 2.0” in general. It all felt off to me, the competition for the quippiest quip, the most outrageous statement in so many variations, the most interactions, impressions, counters going up. Both on the site and behind the scenes, everybody and everything was replaced with carelessness, it seemed. Product managers and engineers and designers build things they don’t really understand or care about, for users who don’t really care to use them, all to “increase reach” or “activate audiences”.

There is another aspect, and that is control and data sovereignty. Without open APIs, without clear delineations of what data I own, what data someone else has, who controls what data and what is done with it, the power asymmetry of using centralized services has no chance of ever going away. Twitter had been limiting and shutting down their APIs over years, strangling the ecosystem in the process: alternative clients (which would not show ads, or not follow the product “strategy” Twitter supposedly had), bots (some of which were used for evil, as things go), analytics (with their implications for user privacy as well as making Twitter-the-company measurable), and more. It made me dislike Twitter and I did not want to use it anymore.

But of course I did. For five more years, in fact. Because even while it declined in so many ways, it was also reinvigorated, as well as plainly addictive. (Now that I know I have ADHD, many things are much clearer. Hindsight and all that.) Twitter was at once a tool for activism for social justice as well as a tool for disinformation campaigns at any and all scales, some of which were unprecedented (as far as we know). It was emotional, it was activist, it was outrageous, it was communal, it was solidarity. I met new people, made new friends, new enemies, suffered, rejoiced.

No more. All my criticisms from the past still hold, and there are some new ones. Twitter, as a platform, has become wholly too self-important. Maybe even too actually-important, it’s hard to say. A fitting analogy I saw (on Twitter, of course) was that of a spent fuel pool. The highly radioactive outrage is focused on Twitter, often a tempest in a teapot, and only after it has had its turn, has decayed sufficiently to be exposed to the world at large, does the topic du jour become part of the mainstream narrative – or, in many cases and just as well, it doesn’t. If everything is important, an outrage, a scandal, a shitstorm and a hashtag, it might as well be that nothing is.

That is evidently false. A great many things, among them social justice, fighting climate change, smashing capitalism and rebuilding a communal society, are indeed important. But if the interactions are limited, broken up into 280 characters, starting a fission reaction if “successful”, releasing a lot of energy and rapidly decaying? That doesn’t do them justice. Neither can the momentum be harvested – everything is in the walled garden, no, reaction chamber, shielded except for very clearly defined interfaces, the most rebellious of which is likely the screencap.

Now, what then? Federated and open networks like Mastodon solve some of the issues, but Twitter’s social graph is still strong even as Twitter is trying to damage it beyond repair, and the user experience of federated systems is still lagging. Maybe we just need to manage people’s expectations better? It’s not like any of the centralized platforms gives you access to the world’s population – there’s a lot of social justice to be done before that is even possible – but it sure feels that way for the select few. Discovery at scale for distributed systems is hard; can we hide that complexity behind a different abstraction than centralized platforms with highly-paid engineers?

A lot of words to say that I have, again, left Twitter. That I don’t see myself coming back this time, but who knows. That I wish things had gone differently with personal websites and blogs and XFN and RDF and all that.

In the (potential) next installments, I’ll look back at the “blogosphere” and its protocols, and how we can try to excavate and reuse some parts of it.