One of the internally developed interfaces, PanLem, has two main purposes:

  • To give those who develop content in the database access to most of the functionality they need.
  • To test PanLex as a platform for automated panlingual localization.

Database access

You can use PanLem to perform numerous queries on the database. With such queries, you can discover facts and make permitted modifications to the data.

The most common factual queries seek translations. You can get translations by specifying an expression and the language variety you want it translated into. The translation inference and fuzzy matching that PanLem offers are based on simple algorithms and do not claim to represent state-of-the-art methods. Some reasonably desirable operations are beyond PanLem's capabilities and require access to the database through a PostgreSQL client (generally psql). The PanLex API is an alternative interface with which you can obtain facts from the database.

Automated localization

What you see

You can choose to interact with PanLem in more than 500 different language varieties. Why only 500 rather than 11,000? Because PanLex doesn’t yet have enough data to make the interface work well in any of the others. As we add data to PanLex, every few days another language variety gets added to the list.

When you start a session with PanLem, it lists the available language varieties and you choose one.

How it works

To localize the interface automatically, we make PanLem communicate lemmatically. Specifically, every message or label displayed by PanLem is a sequence of one or more expression texts. Expression texts in PanLex are all, in principle, lemmas, so this kind of communication is called lemmatic. For example, rather than saying “Count of denotations”, PanLem says (if you are using English) “denotation—count”.

There are a little more than 100 concepts expressed by the expressions in PanLem’s messages and labels. To make programming easier, we have created a concepticon (i.e. a closed artificial language variety), called “PanLem”, that contains one expression for each of those concepts. These PanLem expressions appear in PanLem’s code. Localizing the interface means translating each PanLem expression into the user’s language variety.

Before we include a language variety in the list offered to PanLem users, we require it to have expressions for at least 50 of the concepts. Specifically, a function that runs periodically must be able to translate at least 50 of the PanLem expressions into that language variety. The function first tries to find distance-1 translations and, when it cannot, falls back to distance-2 translations.
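The eligibility rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not PanLem's actual code: the function names, the dictionary-based lookup tables and the sample entries are all hypothetical, and in practice the translations would come from queries on the PanLex database.

```python
# Hypothetical sketch of the eligibility check. The lookup tables map a
# PanLem expression to its translation in one target language variety;
# their contents here are invented for illustration.

def best_translation(expr, distance1, distance2):
    """Prefer a distance-1 translation; fall back to distance-2; else None."""
    if expr in distance1:
        return distance1[expr]
    return distance2.get(expr)


def localize(panlem_exprs, distance1, distance2):
    """Return a dict of the PanLem expressions that could be translated."""
    found = {}
    for expr in panlem_exprs:
        translation = best_translation(expr, distance1, distance2)
        if translation is not None:
            found[expr] = translation
    return found


def is_offered(panlem_exprs, distance1, distance2, minimum=50):
    """A variety is offered only if at least `minimum` expressions translate."""
    return len(localize(panlem_exprs, distance1, distance2)) >= minimum


if __name__ == "__main__":
    exprs = ["expr_a", "expr_b", "expr_c"]           # invented PanLem expressions
    d1 = {"expr_a": "alpha"}                         # invented distance-1 hits
    d2 = {"expr_a": "other", "expr_b": "beta"}       # invented distance-2 hits
    print(localize(exprs, d1, d2))
    print(is_offered(exprs, d1, d2, minimum=2))
```

The default threshold of 50 is the one stated above; the demonstration passes a smaller value only because the sample data is tiny.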

When a user works in a language variety for which some PanLem expressions have yielded no translation, those expressions cannot be displayed in that language variety. In their place, PanLem displays the PanLem expressions themselves, in italics.
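As a rough illustration of this fallback, the sketch below builds a lemmatic label from a sequence of PanLem expressions and flags the ones that had to be left untranslated. Everything in it is an assumption made for the example: the function names, the sample expressions, and the use of underscores as a plain-text stand-in for italics. The separator is the dash that appears in the English example given earlier.

```python
# Hypothetical sketch: render a lemmatic label with fallback to the
# PanLem expression when no translation is available.

SEPARATOR = "\u2014"  # the dash used between lemmas in the English example


def render(panlem_exprs, translations):
    """Return (text, is_fallback) pairs, one per PanLem expression.

    `translations` maps PanLem expressions to texts in the user's
    language variety; a missing key means no translation was found.
    """
    return [
        (translations.get(expr, expr), expr not in translations)
        for expr in panlem_exprs
    ]


def to_plain_text(parts):
    """Join the parts, wrapping fallbacks in underscores to mimic italics."""
    return SEPARATOR.join(
        f"_{text}_" if is_fallback else text for text, is_fallback in parts
    )


if __name__ == "__main__":
    # Invented PanLem expressions and a partial translation table.
    label = ["pl_denotation", "pl_count"]
    partial = {"pl_denotation": "denotation"}
    print(to_plain_text(render(label, partial)))
```

Keeping the fallback flag separate from the text leaves the choice of styling to the display layer, so the same pairs could be rendered as HTML italics or any other markup.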


Detailed documentation on the following aspects of PanLem is available: