Ordering options by frequency with hysteresis

(Skip to the horizontal divider for the idea if you don’t care about the context.)

With the continued Twitter chaos (popcorn worthy, in a way), the ActivityPub/Mastodon ecosystem has received a ton of additional attention lately. This includes UI/UX work on both the official Mastodon web UI and multiple alternative clients on several platforms. I am currently using Tusky on Android, Ivory on iOS, and both the Mastodon “Advanced” UI and [Elk](https://elk.zone/) on the web.

One of the features Mastodon exposes prominently is tagging posts with metadata such as the posting language. Unfortunately, current implementations rarely serve multilingual users well: most clients let the user set exactly one default language for posting, and offer a long, usually alphabetically sorted list to pick another option from. At least there is usually type-ahead search, but it is still far from a great user experience.

Mastodon itself lets users choose several “content languages”, as in “show me content in these languages”. This setting could conceivably double as a list of languages to pin to the top of the language selector widget, but a separate setting would be more flexible, since most people can read and understand more languages than they can fluently write in.

There is already a bit of a heuristic in Mastodon mainline: it sets the post language of a reply to the language of the parent post, which makes sense for threads. Additionally, English is always pinned near the top of the list, right after the user’s chosen default posting language (which makes some sense, given the widespread use of English as an international communication language, but then we should also ask why not Mandarin, or Spanish, etc.).

One UI paradigm that is often employed in similar contexts, and one that has also come up in discussions about evolving the language selector feature, is remembering recently used languages. There are more or less sophisticated algorithms here, for example:

  • Strict ordering by recency – the currently selected language will move to the top of the list and push everything else down one slot. Simple, predictable, but a bit annoying if the user frequently posts in different languages.
  • Static ordering from a selection – Of the many languages supported in the standard, the user gets to choose a subset (e.g. with checkboxes) and those get shown first in the widget, in whatever “natural” order they are normally in (e.g. alphabetically, or by database ID) and that order does not change. The advantage is that the user never needs to scan the list visually – click positions remain the same. The disadvantage is, obviously, that the “natural” ordering may not align with the user’s language use frequency at all.
  • Static, user-defined ordering – Same as above, but the user has a way to define the order explicitly. This keeps all the advantages of the previous option without its disadvantage, but the UI is necessarily more complex, both for the users to use and for the developers to build and maintain, which makes it less accessible.
  • Simple ordering by frequency – Usage is counted for each option, and the options are ordered by how often they are used, e.g. if I post in English 50% of the time, 45% in German, and occasionally in Dutch, then the list would show English, German, and Dutch, in that order. While this has many advantages, it can take a while for the ordering to settle if the options are not used at clearly different frequencies. There are also two problem cases. The first is pathological: people who use several options almost equally will see frequent re-ordering caused by nothing but statistical noise. The second is shifting usage: at different times of the day (or of one’s life) one interacts with different communities, in different languages (one simple example could be moving to a new country), and the accumulated counts lag behind.
  • Order by frecency – Wikipedia says “In computing, frecency is any heuristic that combines the frequency and recency into a single measure.” The most popular example is probably most browsers’ URL input boxes – I know for a fact that Firefox uses a sophisticated frecency algorithm, but I would assume that other browsers do as well. Basically, this algorithm weights recently used items more heavily than their raw frequency alone would. A good compromise for most use cases, but for simple items like languages (versus suggested web URLs after a partially typed query), the ordering will likely be unpredictable and opaque for most users, which would be frustrating.
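To make the differences between the automatic orderings concrete, here is a minimal Python sketch of recency, frequency, and frecency ordering over a plain list of past post languages. The function names, the sample history, and the exponential decay with a `half_life` parameter are all my own assumptions for illustration; in particular, this is not Firefox's actual frecency algorithm.

```python
from collections import Counter


def order_by_recency(history):
    """Most recently used language first; each language appears once."""
    ordered = []
    for lang in reversed(history):
        if lang not in ordered:
            ordered.append(lang)
    return ordered


def order_by_frequency(history):
    """Most frequently used language first; ties keep first-seen order."""
    return [lang for lang, _ in Counter(history).most_common()]


def order_by_frecency(history, half_life=2.0):
    """Hypothetical frecency score: every use counts, but its weight
    halves for each `half_life` posts that have happened since."""
    newest = len(history) - 1
    scores = {}
    for position, lang in enumerate(history):
        age = newest - position
        scores[lang] = scores.get(lang, 0.0) + 0.5 ** (age / half_life)
    return sorted(scores, key=scores.get, reverse=True)


# Oldest post first: six in English, then three in German, then one in Dutch.
history = ["en"] * 6 + ["de"] * 3 + ["nl"]

print(order_by_recency(history))    # ['nl', 'de', 'en']
print(order_by_frequency(history))  # ['en', 'de', 'nl']
print(order_by_frecency(history))   # ['de', 'nl', 'en']
```

The same ten posts produce three different orderings, and the frecency result in particular depends on a tuning parameter the user never sees, which is where the "unpredictable and opaque" impression comes from.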

Now, what originally prompted this post was an idea I had while proposing changes to the language selector in Elk: Currently the default post language is the UI display language (not a good choice, they should really be decoupled) and there is no way to configure post language separately. Additionally, the language chooser widget does not show the currently selected option without clicking. All in all, not a great user experience (but one that is common in lots of software, especially when it is developed by a primarily English-speaking group). However, I wanted to make sure that they wouldn’t switch to a simple frequency-based model, because I post almost equally in English and German and I hate nothing more than drop-down menus that rearrange themselves seemingly randomly.

After considering the options outlined above, I thought that maybe another sorting method would be useful, which I admittedly haven’t completely thought through yet, but that I’d like to call “frequency (or frecency) with hysteresis”.

The idea is simple: Instead of looking only at frequency or frecency of items, add a hysteresis threshold.

Hysteresis, in this context, is at its core a “lagging change”, an approach commonly used for e.g. temperature control: Instead of stopping heating exactly at reaching 21.0 degrees C and restarting heating immediately when the temperature falls to 20.9 degrees C, there is a certain window the measurement has to clear first, so the heating would only start at e.g. 20.0 C and then heat up to 21.5 C, instead of constantly flip-flopping between on and off around 21.0 C.
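As a small illustration of the thermostat behaviour, the following Python sketch uses the two thresholds from the paragraph above (20.0 and 21.5 degrees C). The function name and the sample readings are made up for this example.

```python
def heater_states(temperatures, low=20.0, high=21.5, heating=False):
    """Return the heater state (True = on) after each temperature reading.

    The heater switches on at or below `low`, switches off at or above
    `high`, and keeps its previous state anywhere in between.
    """
    states = []
    for temperature in temperatures:
        if temperature <= low:
            heating = True
        elif temperature >= high:
            heating = False
        # Between the two thresholds: no change, this is the hysteresis.
        states.append(heating)
    return states


readings = [20.9, 20.4, 20.0, 20.6, 21.0, 21.4, 21.5, 21.1, 20.9]
print(heater_states(readings))
# [False, False, True, True, True, True, False, False, False]
```

Note that the readings of 20.9 and 21.0 produce different states depending on which threshold was crossed last; a single 21.0 switch point would have toggled the heater several times over the same readings.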

In the context of ordering items in a menu, that would mean that the items do not get re-ordered immediately when their frequency or recency ranking changes, but only after it has changed by a certain amount (or a certain number of times).

An example: Of my last 10 posts, 5 were in English, and 5 in German. The languages drop-down shows English, German, and then all other languages, because I used English last. I then post once in English and once in German. Both languages are now at 6 posts, German is the most recently used, and so the ordering becomes German, English, others, if going by frequency with last-used-wins. Annoying, because chances are good that the next post would be in English, which is now option #2.
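The flip-flopping in this example can be reproduced with a few lines of Python. This is only a sketch of "frequency with last-used-wins" as I understand it; the function name and the way the history is stored are assumptions.

```python
from collections import Counter


def order_by_frequency_last_used_wins(history):
    """Sort languages by use count; ties go to the more recently used one."""
    counts = Counter(history)
    last_used = {lang: position for position, lang in enumerate(history)}
    return sorted(
        counts,
        key=lambda lang: (counts[lang], last_used[lang]),
        reverse=True,
    )


# Oldest post first: 5 German and 5 English posts, English used last.
history = ["de", "en"] * 5
print(order_by_frequency_last_used_wins(history))  # ['en', 'de']

history.append("en")  # English 6, German 5
print(order_by_frequency_last_used_wins(history))  # ['en', 'de']

history.append("de")  # 6 each, German used last
print(order_by_frequency_last_used_wins(history))  # ['de', 'en']

history.append("en")  # English 7, German 6
print(order_by_frequency_last_used_wins(history))  # ['en', 'de']
```

With near-equal usage, almost every post can swap the top two entries.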

If there were a hysteresis to the ordering, it would not change until I’ve posted significantly more in one language recently, let’s say at least a third more than the language currently in the top spot. In the above example, English holds the top spot with 6 posts, and a third of 6 is 2, so German gets the new top spot only once it reaches 6 + 2 = 8 posts, i.e. after I have posted 2 more times in German without posting in English in between. The exact thresholds should be determined based on some actual usage data, but this could nicely work around the challenge of people using multiple items with near equal frequency, and avoid the flip-flopping of a simple recency-based approach.
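Here is a minimal Python sketch of the rule described above, under a few assumptions of my own: the class name `HysteresisOrder` is hypothetical, counts are all-time totals rather than a recent window, and a language only moves up one slot when its count reaches the count of the entry above it plus the margin (one third by default). `Fraction` is used so that 6 + 6/3 is exactly 8 and not a floating-point near miss.

```python
from fractions import Fraction


class HysteresisOrder:
    """Keeps a displayed order that only changes once a language leads
    the entry above it by at least `margin` (as a fraction of that
    entry's count)."""

    def __init__(self, order, counts=None, margin=Fraction(1, 3)):
        self.order = list(order)          # currently displayed order
        self.counts = dict(counts or {})  # uses per language
        self.margin = margin

    def record(self, lang):
        """Count one post in `lang` and return the displayed order."""
        self.counts[lang] = self.counts.get(lang, 0) + 1
        if lang not in self.order:
            self.order.append(lang)
        i = self.order.index(lang)
        while i > 0:
            above = self.order[i - 1]
            needed = self.counts.get(above, 0) * (1 + self.margin)
            if self.counts[lang] < needed:
                break  # lead is too small, keep the current order
            self.order[i - 1], self.order[i] = lang, above
            i -= 1
        return list(self.order)


# State from the example: 6 posts each, English currently shown first.
picker = HysteresisOrder(["en", "de"], {"en": 6, "de": 6})
print(picker.record("de"))  # ['en', 'de']  (German 7, needs 8)
print(picker.record("de"))  # ['de', 'en']  (German 8, threshold reached)
print(picker.record("en"))  # ['de', 'en']  (English 7, now needs 11)
```

Once German is on top, English in turn has to clear German's count plus a third before it moves back, so a single post in either direction never reorders the list. Whether the margin should be relative, absolute, or based on a recent window is exactly the kind of thing real usage data would have to answer.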