Translation evaluation demonstrations

Introduction

Both human and program interfaces provide access to the translations in the PanLex database. They permit inspection of translations attested by particular sources. To say that a source attests a translation of distinct expressions ex0 and ex1 into each other (or, more simply, translates them into each other) is the same as saying that some meaning of the source has a denotation whose expression is ex0 and another denotation whose expression is ex1. We call such a translation a distance-1 translation. The interfaces can tell you, for any specified pair of expressions, which sources, if any, attest them as distance-1 translations of each other.

If you know that ex0 and ex1 are distance-1 translations of each other and also that ex1 and ex2 are distance-1 translations of each other, can you infer that ex0 and ex2 are distance-2 translations of each other? The answer depends on how you define “distance-2 translation”.

We also refer to distance-0 translations, meaning the relationship between two expressions when they are actually the same expression.

The investigation of translation inference, i.e. the discovery and evaluation of translations beyond the distance-0 or -1 translations attested by particular sources, has been the main theme of PanLex-related research, exemplified by the PanDictionary project.

Recognizing the potential value of our data for translation inference, we have built into our interfaces some simple demonstrations of the evaluation of attested and inferred translations. We describe here the demonstration algorithms that we expose (or are in the process of exposing) via our interfaces. We are not developing state-of-the-art inference algorithms or trying to advance the state of the art, but we are happy to cooperate with researchers wishing to do so. Meanwhile, users of our interfaces can (if they wish) use these algorithms to evaluate and select translations.

In the descriptions below, we refer to the two expressions in a distance-0 or -1 translation as ex0 and ex1. We refer to the three expressions in a distance-2 translation as ex0, ex1 (the intermediate expression), and ex2. The concept “distance-2 translation” is further defined for each algorithm as needed.

Algorithm tr1q

Demonstration algorithm tr1q estimates the quality of any two expressions as distance-0 or -1 translations of each other. The API and the PanLem, PanLinx, TeraDict, and Tattoo Generator interfaces report attested expressions and their translations, and the API and PanLem, in addition, attach tr1q qualities to them.

Steps:

Determine which sources translate ex0 and ex1 into each other (or, if ex0 is identical to ex1, assign meanings to it).
Determine the source groups of those sources.
For each of those source groups, determine the qualities of the above-identified sources. Treat the maximum of those qualities as the quality of the source group.
Return the sum of those source groups’ qualities.
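
The steps above can be sketched in Python as follows. This is only an illustrative sketch, not the code behind the interfaces: the function name and the input format (one (source group, quality) pair per attesting source) are assumptions made for the example.

```python
def tr1q(attesting_sources):
    """Sum, over source groups, of the best quality among each group's attesting sources.

    attesting_sources: iterable of (source_group, quality) pairs, one pair per
    source that translates ex0 and ex1 into each other (hypothetical format).
    """
    best_by_group = {}
    for group, quality in attesting_sources:
        # Keep only the maximum quality seen so far for this source group.
        if group not in best_by_group or quality > best_by_group[group]:
            best_by_group[group] = quality
    return sum(best_by_group.values())


# Three sources in two groups: group "a" counts as max(5, 7) = 7, group "b" as 4.
example = [("a", 5), ("a", 7), ("b", 4)]
print(tr1q(example))  # 11
```

Taking the maximum within a group means that a second source in an already represented group raises the result only if its quality is higher than that of the group's other attesting sources.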

Algorithm tr2qh

Demonstration algorithm tr2qh estimates the quality of any two expressions as distance-0 or -2 translations of each other. For this purpose it defines ex0 and ex2 as distance-2 translations of each other if (1) some source s0 translates ex0 and some expression ex1 into each other, (2) some source s1 in a source group different from the source group of s0 translates ex1 and ex2 into each other, and (3) ex1 differs from both ex0 and ex2.

This definition can apply in the special case that ex2 is identical to (and thus a distance-0 translation of) ex0.

The motivation for the requirement that s0 and s1 be in distinct source groups is that an ex0–ex1 translation and an ex1–ex2 translation attested by sources in the same source group (usually by the same source) are vulnerable to either of two objections:

The source assigns a single meaning to ex0, ex1, and ex2, so ex0 and ex2 are distance-1 translations and already have an estimated quality based on that. The fact that the source assigns their meaning to ex1, too, does not add to the evidence for ex0 and ex2 being translations of each other.
The source assigns one meaning to ex0 and ex1, and another meaning to ex1 and ex2. That fact provides evidence not for, but arguably against, ex0 and ex2 being translations of each other.

This algorithm assigns a quality to each distance-2 translation chain between ex0 and ex2. Such a translation chain is defined as a set, {ex1, sg0, sg1}, in which ex1 is an intermediate expression, sg0 is a source group containing a source that translates ex0 and ex1 into each other, and sg1 is a source group, different from sg0, containing a source that translates ex1 and ex2 into each other. The quality returned by the algorithm is the sum of the qualities that it assigns to all of the distinct distance-2 translation chains between ex0 and ex2. Given this definition, a single translation by a single source can participate in multiple distance-2 translation chains and thus contribute multiple times to the quality reported by the algorithm.

The API and the PanLem interface report distance-2 translations and also attach tr2qh scores to them.

Steps:

Find every distance-2 translation chain (defined above) between ex0 and ex2.
For each such chain, determine the qualities of its source groups. As for algorithm tr1q, that quality is the maximum of the qualities of the sources in the source group that attest the applicable (ex0–ex1 or ex1–ex2) translation.
For each such chain, compute the chain quality. That is the geometric mean of the qualities of its two source groups, i.e. the square root of the product of those qualities. For example, if the source-group qualities were 5 and 8, then the chain quality would be the square root of 40, i.e. about 6.32.
Return the sum of all of the chain qualities, rounded to the nearest integer.
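
The steps above can be sketched in Python as follows. This is a sketch under stated assumptions rather than the actual implementation: the function name, the input format (tuples of two expressions, a source group, and a source quality), and the treatment of a chain as an ordered triple (ex1, sg0, sg1) are all choices made for the example.

```python
import math
from collections import defaultdict


def tr2qh(attestations, ex0, ex2):
    """attestations: iterable of (expr_a, expr_b, source_group, quality) tuples,
    one per source that translates expr_a and expr_b into each other
    (hypothetical format)."""
    # For each unordered expression pair, the best quality per source group.
    table = defaultdict(dict)
    for expr_a, expr_b, group, quality in attestations:
        pair = frozenset((expr_a, expr_b))
        table[pair][group] = max(quality, table[pair].get(group, quality))

    expressions = {expr for pair in table for expr in pair}
    total = 0.0
    # ex1 must differ from both ex0 and ex2.
    for ex1 in expressions - {ex0, ex2}:
        first = table.get(frozenset((ex0, ex1)), {})
        second = table.get(frozenset((ex1, ex2)), {})
        for sg0, q0 in first.items():
            for sg1, q1 in second.items():
                if sg0 != sg1:  # the two source groups must differ
                    total += math.sqrt(q0 * q1)  # geometric mean of two values
    return round(total)


example = [
    ("cat", "gato", "g1", 5),
    ("cat", "gato", "g2", 3),
    ("gato", "chat", "g2", 8),
    ("gato", "chat", "g3", 2),
]
# Chains through "gato": g1-g2, g1-g3, and g2-g3 (g2-g2 is excluded).
print(tr2qh(example, "cat", "chat"))  # 12
```

In the example the g1 attestation of cat–gato takes part in two chains, which illustrates how a single translation by a single source can contribute more than once to the reported quality.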

Algorithm tr2qa

Demonstration algorithm tr2qa estimates the quality of any two expressions as distance-2 translations of each other, in a way different from tr2qh.

One difference is a stricter definition of “distance-2 translation”. Algorithm tr2qa adds two further restrictions to the definition of ex0 and ex2 as distance-2 translations of each other, requiring not only that (1) some source s0 in a source group sg0 translate ex0 and some expression ex1 into each other, (2) some source s1 in a source group sg1 different from sg0 translate ex1 and ex2 into each other, and (3) ex1 differ from both ex0 and ex2, but also that (4) no source in sg0 translate ex1 and ex2 into each other and (5) no source in sg1 translate ex0 and ex1 into each other.

For example, if sources in source groups sg0, sg1, and sg2 (and only those) translate ex0 and ex1 into each other and they (and only they) also all translate ex1 and ex2 into each other, tr2qh defines ex0 and ex2 as distance-2 translations of each other, but tr2qa does not.

These additional restrictions also prevent tr2qa from returning any quality when ex2 is identical to ex0, unlike tr2qh.

Another difference is that tr2qa computes, for any ex1, a total quality for the ex0–ex1 translation and a total quality for the ex1–ex2 translation, and then combines those into a single aggregated quality for all translation chains through that ex1. Unlike tr2qh, it does not compute qualities for individual translation chains.

The API and the PanLem interface report distance-2 translations, but at present only PanLem reports tr2qa scores, and only as part of its translation-evaluation feature.

Steps:

Find every expression ex1 that is a distance-1 translation of both ex0 and ex2 and differs from both of them.
For each such expression ex1, do the following:
1. Identify all the unilateral source groups of ex1. Such a source group is one that has at least 1 source attesting the ex0–ex1 translation or the ex1–ex2 translation and has no source attesting the other translation.
2. For each such source group, determine its quality. As for algorithm tr2qh, that quality is the maximum of the qualities of the sources in the source group that attest the applicable (ex0–ex1 or ex1–ex2) translation.
3. Compute the sum of the qualities of the source groups of the ex0–ex1 translation.
4. Compute the sum of the qualities of the source groups of the ex1–ex2 translation.
5. Compute the geometric mean of those sums.
Return the sum of those geometric means, rounded to the nearest integer.
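
The steps above can be sketched in Python as follows. This is an illustrative sketch only: the function name and input format are assumptions, and it reads steps 3 and 4 as summing over the unilateral source groups identified in step 1, which the text implies but does not state outright.

```python
import math
from collections import defaultdict


def tr2qa(attestations, ex0, ex2):
    """attestations: iterable of (expr_a, expr_b, source_group, quality) tuples
    (hypothetical format)."""
    # For each unordered expression pair, the best quality per source group.
    table = defaultdict(dict)
    for expr_a, expr_b, group, quality in attestations:
        pair = frozenset((expr_a, expr_b))
        table[pair][group] = max(quality, table[pair].get(group, quality))

    expressions = {expr for pair in table for expr in pair}
    total = 0.0
    for ex1 in expressions - {ex0, ex2}:
        first = table.get(frozenset((ex0, ex1)), {})
        second = table.get(frozenset((ex1, ex2)), {})
        # A unilateral source group attests exactly one of the two translations.
        sum0 = sum(q for group, q in first.items() if group not in second)
        sum1 = sum(q for group, q in second.items() if group not in first)
        total += math.sqrt(sum0 * sum1)
    return round(total)


example = [
    ("cat", "gato", "g1", 5),
    ("cat", "gato", "g2", 3),
    ("gato", "chat", "g2", 8),
    ("gato", "chat", "g3", 2),
]
# g2 attests both translations, so only g1 (5) and g3 (2) count: sqrt(10).
print(tr2qa(example, "cat", "chat"))  # 3
```

If every source group attests both translations, or if ex2 is identical to ex0, no source group is unilateral and the sketch returns 0.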

Algorithm tr012q

Demonstration algorithm tr012q estimates the quality of any two expressions as distance-0, distance-1, or distance-2 translations of each other, or as both distance-1 and distance-2 translations of each other. PanLem exposes this algorithm.

Motivations

The main motivations for this algorithm (only partly shared by the previously described algorithms) are to:

Provide a quality estimate that considers translations of distances 0, 1, and 2.
Make the algorithm extensible to lengths greater than 2.
Discount multiple attestations from sources in the same source group.
Discount lower source quality.
Discount inferred translations to the extent that intermediate expressions are ambiguous.
Discount longer translation chains.

Definitions

A source attests an expression if any denotation assigns any meaning of the source to the expression.
A source translates two different expressions ex0 and ex1 into each other if denotations assign the same meaning of the source to both ex0 and ex1.
A translation chain between two expressions ex0 and exn is an ordered set of expressions (ex0, ex1, …, exn), such that, for each subset of two adjacent expressions in the set, some source translates the expressions into each other and does not translate the expressions of any other such subset into each other.
A subset of any two adjacent expressions exi and exj in a translation chain, where j = i + 1, is segment i of the chain. The count of segments in a translation chain is the length of the chain.
A translation chain is disjoint if the expressions in each segment are translated into each other by a source in a source group none of whose sources attests any expression in the chain except the expressions of the segment.
A source of a segment of a translation chain is a source that translates the segment’s expressions into each other.
A disjoint source of a segment of a translation chain is a source of the segment that is in a source group none of the sources in which attests any expression in the chain except the expressions of the segment.
A source group of a segment of a translation chain is a source group containing at least one source of the segment.
A disjoint source group of a segment of a translation chain is a source group of the segment containing no source of the segment that is not disjoint.
The quality of a disjoint source group of a segment of a translation chain is the maximum of the qualities of the (disjoint) sources of the segment that are in the source group.
The redundancy of a disjoint source group of a segment of a translation chain is the count of disjoint translation chains between ex0 and exn of which the source group is a disjoint source group of any segment.
The value of a disjoint source group of a segment of a translation chain is the ratio of its quality to its redundancy.
The local ambiguity of an expression ex1 in a source s is the product of the quality of s and the count of the meanings that s assigns to ex1.
The global ambiguity of an expression ex1 is the ratio of the sum of its local ambiguities to the sum of the qualities of the sources that attest it.
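
The two ambiguity definitions amount to a quality-weighted mean of the meaning counts. A minimal Python sketch, assuming a hypothetical input of one (source quality, meaning count) pair per source that attests the expression:

```python
def global_ambiguity(attesting_sources):
    """attesting_sources: iterable of (quality, meaning_count) pairs, one per
    source that attests the expression (hypothetical format)."""
    sources = list(attesting_sources)
    # Local ambiguity in each source: quality times count of meanings assigned.
    local_sum = sum(quality * count for quality, count in sources)
    quality_sum = sum(quality for quality, _ in sources)
    return local_sum / quality_sum


# A quality-5 source assigns 1 meaning; a quality-3 source assigns 3 meanings.
print(global_ambiguity([(5, 1), (3, 3)]))  # 1.75
```

An expression to which every attesting source assigns exactly one meaning has a global ambiguity of 1, so it incurs no discount when a quality is divided by its ambiguity.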

Length-0 case

Suppose the two expressions are identical. We can consider the translation chain between the expression and itself as having length 0. Then the algorithm returns an estimate of the quality of that expression. Roughly, the more independent sources attest the expression, and the higher their qualities, the higher the estimated quality returned by tr012q.

Steps:

Determine which sources attest the expression.
Determine the source groups to which those sources belong.
For each of those source groups, determine the maximum of the qualities of the sources in it that attest the expression.
Return the sum of those maximum qualities.

Length-1 case

Suppose the two expressions differ and at least one source translates them into each other, but no disjoint translation chain of length 2 between them exists. Then the only translation chain the algorithm considers has length 1. (It doesn’t consider possible chains longer than 2.) In this case tr012q returns an estimate of the quality of the length-1 translation chain. Roughly, the more independent sources translate the expressions into each other, and the higher the qualities of those sources, the higher the estimated quality.

Steps:

Determine which sources translate the expressions into each other.
Determine the source groups to which those sources belong.
For each of those source groups, determine the maximum of the qualities of the sources in it that translate the expressions into each other.
Return the sum of those maximum qualities.

Length-2 case

Suppose the two expressions differ and a disjoint translation chain of length 2 between them exists. A length-1 translation chain between them may or may not also exist.

In this case, tr012q returns an estimated quality reflecting the translation chains of length 2 and, if one exists, the translation chain of length 1. Roughly, the estimated quality varies with the number and qualities of attesting sources and discounts redundant and ambiguous attestations.

Let us call the two expressions ex0 and ex2.

Steps:

Determine which sources, if any, translate ex0 and ex2 into each other. If any do, compute the estimated quality of the length-1 translation chain between them, as in the length-1 case. Otherwise, define that quality as 0. That is the estimated quality of ex0 and ex2 as distance-1 translations of each other.
Identify all the disjoint length-2 translation chains (ex0, ex1a, ex2), (ex0, ex1b, ex2), …, (ex0, ex1n, ex2).
For each of those chains (i.e. for each distinct ex1), perform the following steps:
1. Determine the value of each disjoint source group of each segment of the chain.
2. For each segment of the chain, determine the sum of those values.
3. Determine the product of those sums.
4. Determine the ratio of that product to the global ambiguity of ex1.
5. Determine the square root of that ratio.
Determine the sum of those square roots. That is the estimated quality of ex0 and ex2 as distance-2 translations of each other.
Return the sum of the estimated qualities of ex0 and ex2 as distance-1 and distance-2 translations of each other.
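
The arithmetic of the length-2 steps can be sketched in Python as follows. The sketch assumes that the hard parts (finding the disjoint chains, their disjoint source groups, the redundancies, and the global ambiguities) have already been done; the function name and the input format are hypothetical.

```python
import math


def tr012q_length2(distance1_quality, chains):
    """distance1_quality: quality of the length-1 chain, or 0 if there is none.
    chains: list of (segment0, segment1, ambiguity) triples, one per distinct ex1.
    Each segment is a list of (quality, redundancy) pairs, one per disjoint
    source group of that segment; ambiguity is the global ambiguity of ex1.
    (Hypothetical format.)"""
    distance2_quality = 0.0
    for segment0, segment1, ambiguity in chains:
        # The value of a source group is its quality divided by its redundancy.
        sum0 = sum(quality / redundancy for quality, redundancy in segment0)
        sum1 = sum(quality / redundancy for quality, redundancy in segment1)
        distance2_quality += math.sqrt(sum0 * sum1 / ambiguity)
    return distance1_quality + distance2_quality


chains = [
    ([(8, 1)], [(4, 2)], 1.0),  # sqrt(8 * 2 / 1) = 4
    ([(9, 1)], [(4, 2)], 2.0),  # sqrt(9 * 2 / 2) = 3
]
print(tr012q_length2(5, chains))  # 12.0
```

In the example the quality-4 source group serves both chains, so its redundancy is 2 and it contributes a value of 2 to each chain instead of 4.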

Residual case

If the two expressions satisfy the criteria of none of the above cases, the algorithm returns 0 as the estimated quality.

Evaluation

Systematic evaluation of the demonstration algorithms described above has not yet been conducted.

We have correlated tr1q, tr2qh, and tr2qa with judgments of 02016 PanLex interns on the qualities of pairs of expressions selected by the respondents from pairs having translation chains of length 1, 2, or both. In one review of the correlations by Ammon Pike, tr2qa was found to have the best fit among these three algorithms to the human judgments.

Future algorithms

The following incomplete (and possibly obsolete) notes may be useful in the development of algorithms for the evaluation of translation chains of lengths greater than 2.

A distance-n translation of an expression, ex0, is a different expression, ex1, such that ex0 and ex1 are the ends of at least 1 length-n acyclic chain of distance-1 translations. In a chain of translations, the second expression in one translation is the first expression in the next one (if any). A chain is acyclic if no two translations in it have the same first expression or the same second expression. For example, a chain consisting of P→Q, Q→R, R→S, and S→T is acyclic, but P→Q, Q→R, R→S, and S→Q is not.
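
The chain and acyclicity conditions in the preceding paragraph can be checked mechanically. A minimal Python sketch, with hypothetical function names and with each translation represented as a (first expression, second expression) pair:

```python
def is_chain(translations):
    """True if the second expression of each translation is the first of the next."""
    return all(
        translations[i][1] == translations[i + 1][0]
        for i in range(len(translations) - 1)
    )


def is_acyclic(translations):
    """True if no two translations share a first expression or a second expression."""
    firsts = [first for first, _ in translations]
    seconds = [second for _, second in translations]
    return len(set(firsts)) == len(firsts) and len(set(seconds)) == len(seconds)


good = [("P", "Q"), ("Q", "R"), ("R", "S"), ("S", "T")]
bad = [("P", "Q"), ("Q", "R"), ("R", "S"), ("S", "Q")]
print(is_chain(good), is_acyclic(good))  # True True
print(is_chain(bad), is_acyclic(bad))    # True False
```

The second example fails because two of its translations (P→Q and S→Q) have the same second expression.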

The quality reported for a distance-n translation between expressions ex0 and ex1 is the sum of the qualities of its independent heterogeneous attestations. A heterogeneous attestation is a chain of attestations in which no source group is the source group of more than 1 attestation in the chain. Two heterogeneous attestations are independent if, when their elements are ordered from expression ex0 to expression ex1, the source group of at least 1 element of one attestation differs from the source group of the corresponding element of the other attestation.

For example, if sources in group 376 and group 1282 attest that “P” and “Q” are translations of each other, and sources in group 376 and group 5777 attest that “Q” and “R” are translations of each other, there are 3 independent heterogeneous attestations of “P” and “R” being distance-2 translations of each other: a 376-5777 chain, a 1282-376 chain, and a 1282-5777 chain.

The quality of an independent heterogeneous attestation is the geometric mean of the qualities of its elements. For example, if the maximum quality of the sources in group 376 that attest the “P”-“Q” translation is 4, and the maximum quality of the sources in group 5777 that attest the “Q”-“R” translation is 9, then the quality of the 376-5777 independent heterogeneous attestation is the geometric mean (nth root of the product) of 4 and 9, i.e. 6.
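
The enumeration in the example above can be sketched in Python as follows. The function name and input format are assumptions. The qualities 4 (group 376, “P”-“Q”) and 9 (group 5777, “Q”-“R”) come from the example; the other two qualities are invented for illustration.

```python
import math
from itertools import product


def heterogeneous_attestations(segments):
    """segments: list of {source_group: quality} dicts, one per translation in
    the chain, in order (hypothetical format). Returns a dict that maps each
    heterogeneous attestation (a tuple of source groups) to its quality."""
    result = {}
    for groups in product(*segments):
        # Heterogeneous: no source group appears more than once in the chain.
        if len(set(groups)) == len(groups):
            qualities = [segment[group] for segment, group in zip(segments, groups)]
            # Geometric mean: nth root of the product of the n qualities.
            result[groups] = math.prod(qualities) ** (1 / len(qualities))
    return result


segments = [
    {376: 4, 1282: 1},  # groups attesting "P"-"Q" (quality of 1282 is invented)
    {376: 2, 5777: 9},  # groups attesting "Q"-"R" (quality of 376 is invented)
]
attestations = heterogeneous_attestations(segments)
print(sorted(attestations))         # [(376, 5777), (1282, 376), (1282, 5777)]
print(attestations[(376, 5777)])    # 6.0
```

Because each key is a distinct tuple of source groups, any two attestations in the result differ in at least one position and are therefore independent in the sense defined above. The reported quality would be the sum of the dictionary's values.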

PanLem and the PanLex API currently report and evaluate only distance-1 and distance-2 translations.

Addendum

Algorithm tr12q

Demonstration algorithm tr12q is being considered for retirement. Its description below is incomplete.

It estimates the quality of any two expressions as distance-1 or -2 translations of each other. By combining evidence of distance-1 and -2 translations into a single quality, tr12q differs from tr1q, tr2qh, and tr2qa.

The algorithm treats distance-1 translations as special cases of distance-2 translations. It recognizes a translation between ex0 and itself, or ex2 and itself, as a special case of distance-1 translation, so it recognizes an ex0–ex2 translation through ex0 or ex2 as a special case of distance-2 translation.

In defining distance-2 translation, tr12q differs from tr2qa by deleting condition 3 and further restricting condition 4. As a result, ex0 and ex2 are distance-2 translations if (1) some source s0 in a source group sg0 translates ex0 and some expression ex1 into each other, (2) some source s1 in a source group sg1 different from sg0 translates ex1 and ex2 into each other, (3) no source in sg0 translates ex2 and any other expression into each other, and (4) no source in sg1 translates ex0 and ex1 into each other.

Another difference between tr12q and the previously described algorithms is that tr12q discounts multiple attestations by sources in the same source group, insofar as multiple sources in the same source group attest the same translation. The PanLem interface reports these combined translations and their tr12q estimated qualities.

Steps:

Identify all expressions ex1 that are distance-1 translations of both ex0 and ex2, permitting any expression (including ex0 and ex2) to be counted as an ex1 and permitting ex0 and ex2 to be identical. Consider each distinct expression ex1 to produce a translation chain, containing two segments (ex0–ex1 and ex1–ex2).
For each translation chain, identify each source group of any sources that attest the segment-0 translation, the maximum of the qualities of those sources, the count of translation chains in which any source in that same source group attests the segment-0 translation, and the ratio of that maximum to that count. Consider that ratio the chain- and segment-specific quality of the source group. Do the same for segment 1.
Determine, for each segment of each translation chain, the sum of its source groups’ qualities.
Determine, for each translation chain, the geometric mean of that sum for segment 0 and that sum for segment 1.
Return the sum of those geometric means.
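
Since the description of tr12q is incomplete, the following Python sketch covers only the arithmetic of the steps as listed, under one reading of them. It assumes that the chains and the per-segment maximum qualities have already been determined; the function name and input format are hypothetical.

```python
import math
from collections import Counter


def tr12q(chains):
    """chains: dict mapping each ex1 to a pair (segment0, segment1), where each
    segment is a {source_group: max quality of its attesting sources} dict
    (hypothetical format)."""
    # Per segment position, count the chains in which each source group attests.
    counts = [Counter(), Counter()]
    for segments in chains.values():
        for position, segment in enumerate(segments):
            counts[position].update(segment.keys())

    total = 0.0
    for segments in chains.values():
        sums = []
        for position, segment in enumerate(segments):
            # Chain- and segment-specific quality: maximum quality over count.
            sums.append(sum(q / counts[position][g] for g, q in segment.items()))
        total += math.sqrt(sums[0] * sums[1])
    return total


chains = {
    "gato": ({"g1": 8}, {"g2": 4}),    # sqrt((8 / 2) * 4) = 4
    "felino": ({"g1": 8}, {"g3": 9}),  # sqrt((8 / 2) * 9) = 6
}
print(tr12q(chains))  # 10.0
```

Group g1 attests segment 0 in both chains, so its quality of 8 is split into 4 per chain.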