PanLex API documentation

Introduction

The latest public API to the PanLex database may be accessed at http://api.panlex.org/v2/ and https://api.panlex.org/v2/. API details are subject to change.

If you have developed with a previous version of the API, you may wish to consult the version 1 documentation.

Existing applications

The following applications query the PanLex API:

  • Global Glossary: a web-based lexical translation reference
  • PanLex Translator: a Chrome browser extension that translates words that the user selects
  • TeraDict: a family of web-based applications translating expressions that the user enters
  • PanLinx: a tree terminating in millions of pages containing lexical translations, designed for search-engine crawling

Developer guide

HTTP protocol

All API queries may be performed with either HTTP GET or HTTP POST. The type of query (generally the kind of object that will be returned) is specified in the URL. When using GET, additional parameters are specified with URL query parameters. When using POST, additional parameters are specified with a JSON object in the HTTP request body. The POST request body must be a valid JSON object; the empty object {} should be sent if there are no additional parameters.

The main advantage of HTTP GET requests is that they conform to REST and are easier to link. The disadvantage is that some parameters are arrays that might contain many elements, which quickly runs into URL length limits. HTTP POST requests do not have this limitation. We recommend that you use POST requests as the default (for example, when writing an API client or wrapper), since the query size is unrestricted.

The API uses JSON for all responses.

Successful API queries return HTTP status 200. Errors return HTTP status 4xx (the precise value depends on the nature of the error). A more specific description of the error may be found in the JSON response: the code field indicates the category of error, and the message field contains further information. Example code values are ResourceNotFoundError, BadMethodError, MissingParameterError, InvalidArgumentError, InvalidVersionError, and InternalError.
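
As a rough illustration of the POST convention and the error fields described above, the following Python sketch builds a request without sending it and formats an error response. The helper names build_post and describe_error are hypothetical, and the Content-Type header is an assumption; the documentation only specifies that the body must be a JSON object.

```python
import json
import urllib.request

API_BASE = "https://api.panlex.org/v2"  # base URL from the introduction


def build_post(path, params=None):
    """Build (but do not send) a POST request for an API query."""
    # The body must always be a JSON object; send {} when there are no parameters.
    body = json.dumps(params or {}).encode("utf-8")
    return urllib.request.Request(
        API_BASE + path,
        data=body,
        headers={"Content-Type": "application/json"},  # assumption, not documented
        method="POST",
    )


def describe_error(status, body_text):
    """Summarize an error response using its code and message fields."""
    payload = json.loads(body_text)
    return "HTTP {}: {} ({})".format(
        status, payload.get("code"), payload.get("message")
    )


req = build_post("/langvar", {"lang_code": "rus"})
print(req.get_method(), req.full_url, req.data.decode("utf-8"))
```

Passing the request to urllib.request.urlopen would send it; an error status raises urllib.error.HTTPError, whose body can be read and handed to describe_error.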

Limits

API users are requested not to perform more than 2 queries per second. The API server will enforce this rate if more than 100 queries are received in rapid succession from a single IP address, responding with HTTP status code 429.
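
One way a client might stay within the requested rate is to space its own queries. The following is a minimal client-side sketch, not part of the API; the Throttle class and its parameters are hypothetical names introduced for illustration.

```python
import time


class Throttle:
    """Space calls at least 1/max_per_second seconds apart."""

    def __init__(self, max_per_second=2.0, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 1.0 / max_per_second
        self._clock = clock      # injectable so the class can be tested without waiting
        self._sleep = sleep
        self._next_allowed = None

    def wait(self):
        """Block until the next query may be sent, then reserve the following slot."""
        now = self._clock()
        if self._next_allowed is not None and now < self._next_allowed:
            self._sleep(self._next_allowed - now)
            now = self._next_allowed
        self._next_allowed = now + self.min_interval
```

Calling wait() immediately before each query keeps a single-threaded client at or below 2 queries per second. A client that still receives status 429 should back off further.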

Array parameters may not contain more than 10000 elements.

The returned result array will contain a maximum of 2000 elements.

The global parameter offset may not be greater than 250000.

Array parameters

For query parameters whose value may be an array, the array often has only one element (e.g., there is a single sort value, or you are searching on a single language variety). In such cases, you can simply pass a string or number as appropriate. It will automatically be converted to a one-element array.
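
The conversion happens on the server, so a client does not need to do anything. The Python sketch below only mirrors the described behavior to show that the two forms of a query are equivalent; as_array is a hypothetical helper name.

```python
def as_array(value):
    """Mirror the described conversion: a lone string or number becomes a one-element list."""
    if isinstance(value, (list, tuple)):
        return list(value)
    return [value]


# These two request bodies should be treated identically by the API.
short_form = {"uid": "rus-000"}
long_form = {"uid": ["rus-000"]}
print(as_array(short_form["uid"]) == as_array(long_form["uid"]))
```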

Global query parameters and response fields

The following global query parameters are available:

  • after: array of integers or strings containing values of sort fields. Records will be returned that occur immediately after the indicated value(s) in the sort order. Can be used as an alternative to offset. This parameter can only be used with queries returning a result array.
  • cache: boolean value indicating whether to return cached responses. Defaults to true. Set to false if you want to ensure that your response contains the latest data from the database. Cached responses will be no more than 24 hours old.
  • echo: boolean value indicating whether to pass the query back in the response as request, which is an object with the keys url and query. Defaults to false.
  • exclude: array specifying fields to exclude in the response which would otherwise be returned by default. This parameter can only be used with queries returning a result array or taking a URL parameter.
  • include: array specifying extra fields to include in the response. See documentation below for possible values. This parameter can only be used with queries returning a result array or taking a URL parameter.
  • indent: boolean value indicating whether to pretty-print the JSON response. Defaults to false.
  • limit: integer value indicating the maximum number of records to return. Defaults to resultMax, i.e., the maximum. This parameter can only be used with queries returning a result array.
  • offset: integer value indicating how many records to omit from the beginning of the returned records. Defaults to 0; cannot be greater than 250000. This parameter can only be used with queries returning a result array.
  • sort: array of fields to sort the result by. Sort strings take the format <field> or <field> asc for ascending order, <field> desc for descending order. You may also sort by include objects if they are present (as long as they do not return an array). If you sort by the special field random, the result will be returned in random order. The default, if no sort parameter is specified, is to sort by ID in ascending order. This parameter can only be used with queries returning a result array.

The JSON response object can contain the following keys:

  • count: the number of results found, for count queries.
  • countType: string specifying the type of objects in count.
  • request: object representing the query, if echo was on.
  • result: array of result objects, if the query was for a set of objects. Limited to resultMax per query; use offset to get more.
  • resultMax: the maximum number of result objects that will be returned in a single query (currently 2000).
  • resultNum: number of objects returned in result.
  • resultType: string specifying type of objects in result.

Other keys are present for particular query types; see below.
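
The offset parameter and the resultNum and resultMax fields can be combined to page through a large result set. The following Python sketch shows one possible loop. It assumes that limit is left at its default, and fetch_page is a hypothetical callable that takes a parameter dictionary and returns the decoded JSON response.

```python
def fetch_all(fetch_page, params, max_offset=250000):
    """Collect results page by page using the offset parameter."""
    results = []
    offset = 0
    while offset <= max_offset:  # offset may not exceed 250000
        page = fetch_page(dict(params, offset=offset))
        results.extend(page["result"])
        # A page shorter than resultMax means nothing is left to fetch.
        if page["resultNum"] < page["resultMax"]:
            break
        offset += page["resultNum"]
    return results
```

For result sets that extend beyond the offset limit, the after parameter is the documented alternative.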

URL parameters

When a query calls for a single language variety, expression, or source, this is passed as a URL parameter. URL parameters are indicated in the documentation below as <definition>, <denotation>, <expr>, <exprtxt>, <langvar>, <meaning>, and <source>, and take the following format:

  • <definition>: definition ID number.
  • <denotation>: denotation ID number.
  • <expr>: expression ID number.
  • <exprtxt>: expression text.
  • <langvar>: language variety ID number or uniform identifier (aaa-000).
  • <meaning>: meaning ID number.
  • <source>: source ID number or label.

The database object corresponding to a URL parameter is always returned in the result object. For example, the object corresponding to <langvar> will be returned as langvar.

Examples

To retrieve information about all language varieties in PanLex, you can send the following query from the command line using curl:

curl http://api.panlex.org/v2/langvar -d '{ "indent": true }'

This query requires no additional parameters besides the URL. We could pass an empty JSON object {} in the request body, but for convenience, we set indent to true so the response will be pretty-printed. The response is structured as follows:

{
    "resultType": "langvar",
    "result": [
        {
          "id": 1,
          "lang_code": "aar",
          "var_code": 0,
          "mutable": true,
          "name_expr": 1453510,
          "script_expr": 18147719,
          "uid": "aar-000",
          "name_expr_txt": "Qafár af",
          "name_expr_txt_degr": "qafaraf"
        },
        ...
    ],
    "resultNum": 2000,
    "resultMax": 2000
}

The result array contains a set of language variety objects, as indicated by resultType. resultNum indicates that 2000 results were returned, the maximum for a single query; we could do another query to get more results. (For an explanation of the language variety object, see below.)

Now suppose that we want to retrieve some expressions from Russian. We must first determine the language variety ID or uniform identifier for (a variety of) Russian. If we already know that the language code for Russian is rus, we can look up matching language varieties as follows:

curl http://api.panlex.org/v2/langvar -d '{ "lang_code": "rus" }'

The lang_code parameter says to search for language varieties with matching language codes. The value of lang_code is an array of three-character strings representing ISO 639 language codes. The results contain several language varieties; we pick this one as corresponding to Russian written in Cyrillic:

{
    "id": 620,
    "lang_code": "rus",
    "var_code": 0,
    "mutable": true,
    "name_expr": 43116,
    "script_expr": 17807488,
    "uid": "rus-000",
    "name_expr_txt": "русский",
    "name_expr_txt_degr": "русскии"
}

(We have omitted the indent parameter above for brevity, but will continue to use pretty-printed JSON for the purpose of these examples.)

Now that we know the language variety ID and uniform identifier for Russian (either will do), we can look up some Russian expressions. The following query looks up the expression “дерево” (the Russian word for “tree”):

curl http://api.panlex.org/v2/expr -d '{ "uid": "rus-000", "txt": "дерево" }'

You will see that the result contains a single expression object with the ID 750865. If we want to know this expression’s denotations—i.e., what PanLex sources the expression occurs in, and what translations it is linked to in those sources—we can do a denotation query as follows:

curl http://api.panlex.org/v2/denotation -d '{ "expr": 750865 }'

The results now contain an array of denotation objects, one of which is the following:

{
    "id": 25930350,
    "meaning": 9537585,
    "source": 603,
    "expr": 750865
}

If we want to get more information about the meaning to which this denotation belongs, we can look it up with the following query, specifying that definitions should be included:

curl http://api.panlex.org/v2/meaning -d '{ "id": 9537585, "include": "definition" }'

The result should contain an expr array with the expression IDs that share this meaning, and a definition array containing one definition “растущий” in language variety 620 (which we have already determined is Russian).

Now suppose that we want to translate “дерево” into English. We can do this with an expression query requesting translations of expression 750865 into English:

curl http://api.panlex.org/v2/expr -d '{ "uid": "eng-000", "trans_expr": 750865 }'

You should see that the expression “tree” is one of the results, and the rest are expressions with closely related meanings.

Language variety queries

/langvar

Returns information about a set of language varieties, as result array. Parameters:

  • expr_txt: array of expression texts. Restricts results to language varieties containing a matching expression.
  • expr_txt_degr: array of expression texts. Restricts results to language varieties containing a matching expression in degraded form.
  • grp: array of language variety group IDs.
  • id: array of language variety IDs.
  • include: valid values are denotation_count, expr_count, langvar_char, langvar_cldr_char, region_expr_langvar, region_expr_txt, region_expr_uid, and script_expr_txt.
  • lang_code: array of three-letter ISO 639 language codes.
  • meaning: array of meaning IDs. Restricts results to language varieties with the associated meanings.
  • mutable: boolean value. Restricts results to language varieties that are mutable (if true) or immutable (if false).
  • name_expr: array of language variety default name expression IDs.
  • name_expr_txt: array of language variety default name expression texts.
  • name_expr_txt_degr: array of language variety default name expression texts to be matched in their degraded form.
  • region_expr: array of expression IDs. Restricts results to language varieties with the specified primary regions.
  • region_expr_langvar: array of language variety IDs. Restricts results to language varieties whose primary region expression is in the specified language varieties.
  • region_expr_txt: array of expression texts. Restricts results to language varieties whose primary region expression has one of the specified texts.
  • region_expr_uid: array of language variety uniform identifiers. Restricts results to language varieties whose primary region expression is in the specified language varieties.
  • script_expr: array of language variety art-262 (ISO 15924) expression IDs. Restricts results to language varieties in the specified scripts.
  • script_expr_txt: array of language variety art-262 (ISO 15924) expression texts. Restricts results to language varieties in the specified scripts.
  • trans_expr: array of expression IDs. Restricts results to those language varieties containing a one-hop translation of one of the expressions.
  • uid: array of language variety uniform identifiers.

You can pass any combination of these parameters; results will be returned for matching language varieties. If you do not specify any search parameters, results will be returned for all language varieties in PanLex.

/langvar/count

Returns the number of matching language varieties. Parameters are the same as for /langvar.

/langvar/<langvar>

Returns information about a single language variety, as langvar. The include parameter is the same as for /langvar. There are no other parameters.

Language variety objects

Language variety objects contain the following keys:

  • denotation_count: number of denotations in the language variety (only if in include).
  • expr_count: number of expressions in the language variety (only if in include).
  • grp: language variety group ID number.
  • id: language variety ID number.
  • lang_code: three-letter ISO 639 language code.
  • langvar_char: array of code point ranges (only if in include).
  • langvar_cldr_char: array of exemplar character objects (only if in include).
  • meaning: ID of the art:PanLex meaning associated with the language variety.
  • mutable: boolean value indicating whether the language variety is mutable.
  • name_expr: language variety default name’s expression ID.
  • name_expr_txt: language variety default name’s expression text.
  • name_expr_txt_degr: language variety default name’s degraded expression text.
  • region_expr: language variety’s primary region expression ID.
  • region_expr_langvar: language variety’s primary region expression’s language variety ID (only if in include).
  • region_expr_txt: language variety’s primary region expression’s text (only if in include).
  • region_expr_uid: language variety’s primary region expression’s language variety uniform identifier (only if in include).
  • script_expr: language variety’s script, coded as the language variety art-262 (ISO 15924) expression ID.
  • script_expr_txt: text of the script_expr expression (only if in include).
  • uid: language variety’s uniform identifier.
  • var_code: numeric variety code.

Code point ranges

The code point range is an array representing a range of permissible Unicode characters for a language variety. The array takes the form [first, last], where first is the numeric value of the first code point in the range and last is the value of the last code point in the range.

For example, for English (language variety eng-000), the first code point range is [32, 33]. This covers U+0020 (SPACE) through U+0021 (EXCLAMATION MARK), inclusive. Note that JSON numeric values are always decimal.
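
A client could use these ranges to check whether a string contains only permissible characters. The Python sketch below is one way to do so; in_ranges is a hypothetical helper, and apart from [32, 33] the ranges shown are illustrative rather than actual API output.

```python
def in_ranges(text, ranges):
    """Return True if every character falls inside one of the [first, last] ranges."""
    return all(
        any(first <= ord(ch) <= last for first, last in ranges)
        for ch in text
    )


# [32, 33] is the documented first range for eng-000; the others are made up.
ranges = [[32, 33], [65, 90], [97, 122]]

print(in_ranges("Hi there!", ranges))
print(in_ranges("naïve", ranges))

# Decimal values convert to the usual U+XXXX notation like this:
print("U+{:04X}".format(32))
```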

CLDR character objects

CLDR character objects represent the exemplar characters for a language variety, as defined by the Unicode Common Locale Data Repository. They contain the following keys:

  • category: character category, typically “pri” (primary/standard), “aux” (auxiliary), or “pun” (punctuation).
  • locale: Unicode script locale abbreviation.
  • range: a code point range (see above).

Expression and translation queries

/expr

Returns information about the specified expressions, as result array. This is also the endpoint for translation queries. Parameters:

  • id: array of expression IDs.
  • include: valid values are expr_score, trans_langvar, trans_path, trans_quality, trans_txt, trans_txt_degr, trans_uid, and uid.
  • interm1_expr_diff_langvar: boolean indicating whether a distance-2 translation’s intermediate expression must be in a different language variety than the translation’s starting and ending expression. Default false.
  • interm1_expr_langvar: array of language variety IDs. Restricts results to expressions whose distance-2 intermediate expression is in the specified language varieties.
  • interm1_expr_uid: array of language variety uniform identifiers. Restricts results to expressions whose distance-2 intermediate expression is in the specified language varieties.
  • interm1_grp: array of source group IDs. Restricts results to expressions whose distance-2 translation’s ending (“intermediate”) source is in one of the specified source groups.
  • interm1_source: array of source IDs. Restricts results to expressions whose distance-2 translation’s ending (“intermediate”) source is one of the specified sources.
  • lang_code: array of three-letter ISO 639 language codes. Restricts results to expressions in varieties of the specified languages.
  • langvar: array of language variety IDs. Restricts results to expressions in the specified language varieties.
  • mutable: boolean value. Restricts results to expressions from language varieties that are mutable (if true) or immutable (if false).
  • range: array of the form [field, start, end]. Restricts results to expressions whose field value is alphabetically between the start and end strings. field may be “txt” or “txt_degr”.
  • trans_distance: integer specifying the number of translation hops. Pass 1 for one hop (direct or distance-1 translation), 2 for two hops (indirect or distance-2 translation). Defaults to 1. Only relevant if you are translating. Note that if you set this to 2, for performance reasons we recommend that you specify the source expression(s) with trans_expr rather than one of the alternatives.
  • trans_grp: array of source group IDs. Restricts results to expressions that are translations originating in the specified source groups.
  • trans_expr: array of expression IDs. Restricts results to expressions that are translations of the specified expressions.
  • trans_langvar: array of language variety IDs. Restricts results to expressions that are translations of the specified language varieties’ expressions.
  • trans_quality_algo: string specifying the translation quality algorithm. Valid values are “geometric” (the default) and “arithmetic”. See below for details. Only relevant when trans_distance is 2.
  • trans_quality_min: non-negative integer specifying a minimum translation quality. Translations with a lower quality will be discarded. Defaults to 0, i.e., no minimum. Only relevant if you are translating.
  • trans_source: array of source IDs. Restricts results to expressions that are translations originating in the specified sources.
  • trans_source_quality_min: integer from 0 to 9 specifying a minimum source quality. Translations derived from sources with a lower quality will be discarded. Defaults to 0, i.e., no minimum. Only relevant if you are translating.
  • trans_txt: array of expression texts. Restricts results to expressions that are translations of expressions with matching texts.
  • trans_txt_degr: array of expression texts. Restricts results to expressions that are translations of expressions with matching texts in their degraded form.
  • trans_uid: array of language variety uniform identifiers. Restricts results to expressions that are translations of expressions in the specified language varieties.
  • txt: array of expression texts.
  • txt_degr: array of expression texts to be matched in their degraded form.
  • uid: array of language variety uniform identifiers. Restricts results to expressions from the specified language varieties.

You must provide at least one of the parameters other than include or mutable. If you are translating, you must provide at least one of the trans_expr, trans_txt, or trans_txt_degr parameters. Results will be returned for all matching expressions.
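
The parameters above allow a translation to be requested in a single query, without first looking up the source expression's ID. The following Python sketch builds such a request body and checks the stated requirement for translation queries. The function name translation_query is hypothetical, and sorting by trans_quality relies on the documented ability to sort by included fields.

```python
import json

TRANS_KEYS = ("trans_expr", "trans_txt", "trans_txt_degr")


def translation_query(target_uid, **params):
    """Build the JSON body for a translation query against /expr."""
    # A translation query needs at least one of these source parameters.
    if not any(key in params for key in TRANS_KEYS):
        raise ValueError("need one of: " + ", ".join(TRANS_KEYS))
    body = {"uid": target_uid}
    body.update(params)
    return body


body = translation_query(
    "eng-000",
    trans_uid="rus-000",
    trans_txt="дерево",
    include="trans_quality",
    sort="trans_quality desc",
)
print(json.dumps(body, ensure_ascii=False))
```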

/expr/count

Returns the number of matching expressions. Parameters are the same as for /expr, but there are no required parameters.

/expr/<expr>

Returns information about a single expression, as expr. The include parameter is the same as for /expr (but only uid makes sense here). There are no other parameters.

/expr/<langvar>/<exprtxt>

Returns information about a single expression (in variety <langvar> and with text <exprtxt>), as expr. The include parameter is the same as for /expr (but only uid makes sense here). There are no other parameters.
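
Because <exprtxt> is part of the URL path, non-ASCII text and characters such as "/" need escaping. The Python sketch below assumes the API accepts standard percent-encoded UTF-8, which the documentation does not state explicitly; expr_url is a hypothetical helper name.

```python
from urllib.parse import quote


def expr_url(langvar, exprtxt, base="https://api.panlex.org/v2"):
    """Build the URL for /expr/<langvar>/<exprtxt>."""
    # safe="" so that a "/" inside the expression text is escaped as well.
    return "{}/expr/{}/{}".format(
        base, quote(str(langvar), safe=""), quote(exprtxt, safe="")
    )


print(expr_url("rus-000", "дерево"))
print(expr_url(620, "и/или"))
```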

/expr/index

This query produces an alphabetically sorted index of expressions in the specified language varieties, or in all varieties in PanLex. Parameters:

  • langvar: array of language variety IDs.
  • step: the number of expressions summarized in each index item. Required; minimum 250.
  • uid: array of language variety uniform identifiers.

Expressions are first sorted by their degraded expression text, then divided into chunks of size step. The result is returned as the index array. Elements of index are arrays containing two expression objects each, representing the first and last expression from each index chunk.

Because this query can produce large responses, the indent parameter is ignored.
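
To make the chunking concrete, the following Python sketch reproduces the described procedure on a handful of local expression objects. It is an illustration of the description above, not the server's implementation; build_index is a hypothetical name, and the tiny step value is for demonstration only, since the API requires a minimum of 250.

```python
def build_index(exprs, step):
    """Sort by degraded text, split into chunks of size step, keep first and last."""
    ordered = sorted(exprs, key=lambda e: e["txt_degr"])
    index = []
    for start in range(0, len(ordered), step):
        chunk = ordered[start:start + step]
        index.append([chunk[0], chunk[-1]])
    return index


# Made-up expression objects with only the key needed here.
sample = [{"txt_degr": t} for t in ["b", "a", "d", "c", "e"]]
for first, last in build_index(sample, step=2):
    print(first["txt_degr"], last["txt_degr"])
```

When the final chunk contains a single expression, that expression appears as both the first and the last element of its index item.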

Expression objects

Expression objects contain the following keys:

  • expr_score: expression score, equal to the expression normalization score (only if specified in include). Retrieved from a cache that is updated daily.
  • id: expression ID number.
  • langvar: expression’s language variety ID number.
  • trans_expr: ID number of expression from which the expression was translated (only if a translation parameter was specified in the query).
  • trans_langvar: language variety ID for expression from which the expression was translated (only if specified in include and a translation parameter was specified in the query).
  • trans_path: array of translation paths used to produce the translation (only if specified in include and a translation parameter was specified in the query). Each translation path is an array of translation hop objects (see below), one for each hop in the translation, in order from beginning to end. Thus, trans_path contains an array of arrays.
    • A translation hop consists of a PanLex meaning with a beginning and end denotation. Expressions tie hops together: one hop’s end denotation has the same expression as the following hop’s beginning denotation. The term “distance-n translation” (where n is typically 1 or 2) refers to a translation with n hops.
    • Each translation hop object has the following keys: meaning, containing the meaning ID; source, containing the source ID; denotation1, containing the beginning denotation ID; denotation2, containing the end denotation ID; and (unless it is the final hop) expr2, containing the ID of the expression that ties the hop to the next one, and langvar2, containing the language variety ID of expr2.
    • The following is an example trans_path object for a distance-1 translation from eng-000 (English) “bat” into spa-000 (Spanish) “murciélago”: [ { "meaning": 28118413, "source": 5944, "denotation1": 83715137, "denotation2": 83715210 } ]. Since it is a distance-1 translation, there is only one object in the array. The translation is documented in meaning 28118413, which is in source 5944 (fra-mul:Sérasset). The beginning denotation (of “bat”) is 83715137, and the end denotation (of “murciélago”) is 83715210.
  • trans_quality: translation quality score (only if specified in include and a translation parameter was specified in the query). For trans_distance 1, it is the sum of the quality value of all sources from distinct source groups attesting the translation. The same algorithm is used for trans_distance 2 when trans_quality_algo is “arithmetic”, combining the sources from both hops for the purpose of the score. When trans_quality_algo is “geometric” (the default), it is the sum, rounded to the nearest integer, of the geometric mean of each distinct translation path’s two quality values. Distinctness in this context is defined by the combination of the intermediate expression linking the two hops and the source groups of the two sources. See translation evaluation for more.
  • trans_txt: text of expression from which the expression was translated (only if specified in include and a translation parameter was specified in the query).
  • trans_txt_degr: degraded text of expression from which the expression was translated (only if specified in include and a translation parameter was specified in the query).
  • trans_uid: language variety uniform identifier for expression from which the expression was translated (only if specified in include and a translation parameter was specified in the query).
  • txt: expression text.
  • txt_degr: degraded expression text.
  • uid: expression’s language variety uniform identifier (only if in include).
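
The “geometric” trans_quality calculation described above can be sketched in Python as follows. This is an interpretation of the prose, not the server's code: it assumes each distinct path is reduced to a pair of per-hop quality values, and the tie-breaking rule for rounding is an assumption. The name geometric_quality is hypothetical.

```python
import math


def geometric_quality(paths):
    """Sum the geometric mean of each path's two quality values, then round.

    paths is a list of (quality_hop1, quality_hop2) pairs, one per distinct path.
    """
    total = sum(math.sqrt(q1 * q2) for q1, q2 in paths)
    return math.floor(total + 0.5)  # round half up (assumed tie-breaking)


# Two made-up paths: sqrt(4 * 9) + sqrt(5 * 5) = 6 + 5
print(geometric_quality([(4, 9), (5, 5)]))
```

Under this reading, a path with one weak hop contributes less than the arithmetic sum of its two values would, which penalizes translations that depend on a low-quality source.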

Denotation queries

/denotation

Returns information about the specified denotations, as result array. Parameters:

  • denotation_class: array of denotation classification arrays. Each array should contain two elements: the superclass expression ID (null if none) and the class expression ID (null to match all expressions).
  • denotation_prop: array of denotation property arrays. Each array should contain two elements: the attribute expression ID and the property string (null to match all strings).
  • expr: array of expression IDs.
  • id: array of denotation IDs.
  • include: valid values are denotation_class and denotation_prop.
  • langvar: array of language variety IDs. Restricts results to denotations of expressions in the specified language varieties.
  • meaning: array of meaning IDs.
  • source: array of source IDs.
  • uid: array of language variety uniform identifiers. Restricts results to denotations of expressions in the specified language varieties.

You must provide at least one of the expr, id, langvar, meaning, source, or uid parameters. Results will be returned for all matching denotations.

/denotation/count

Returns the number of matching denotations. Parameters are the same as for /denotation, but there are no required parameters.

/denotation/<denotation>

Returns information about a single denotation, as denotation. The include parameter is the same as for /denotation. There are no other parameters.

Denotation objects

Denotation objects contain the following keys:

  • denotation_class: array of denotation classifications (only if in include). Each denotation classification is a two-element array consisting of the superclass expression ID and the class expression ID.
  • denotation_prop: array of denotation properties (only if in include). Each denotation property is a two-element array consisting of the attribute expression ID and the property string.
  • expr: expression ID number.
  • id: denotation ID number.
  • meaning: meaning ID number.
  • source: source ID number.

Meaning queries

/meaning

Returns information about a set of meanings, as result array. Parameters:

  • expr: array of expression IDs. Restricts results to meanings containing all of the specified expressions.
  • id: array of meaning IDs.
  • include: valid values are definition, meaning_class, and meaning_prop.
  • meaning_class: array of meaning classification arrays. Each array should contain two elements: the superclass expression ID (null if none) and the class expression ID (null to match all expressions).
  • meaning_prop: array of meaning property arrays. Each array should contain two elements: the attribute expression ID and the property string (null to match all strings).
  • source: array of source IDs. Restricts results to meanings from the specified sources.

You must provide at least one of the expr, id, or source parameters. Results will be returned for all matching meanings.

/meaning/count

Returns the number of matching meanings. Parameters are the same as for /meaning, but there are no required parameters.

/meaning/<meaning>

Returns information about a single meaning, as meaning. The include parameter is the same as for /meaning. There are no other parameters.

Meaning objects

Meaning objects contain the following keys:

  • definition: array of definition objects (only if in include). Definition objects are the same as for definition queries (see below), with the meaning key omitted.
  • denotation: array of IDs of denotations of the meaning.
  • expr: array of IDs of expressions with the meaning.
  • id: meaning ID number.
  • meaning_class: array of meaning classifications (only if in include). Each meaning classification is a two-element array consisting of the superclass expression ID and the class expression ID.
  • meaning_prop: array of meaning properties (only if in include). Each meaning property is a two-element array consisting of the attribute expression ID and the property string.
  • source: source ID number.

Definition queries

/definition

Returns information about a set of definitions, as result array. Parameters:

  • expr: array of expression IDs. Restricts results to definitions of meanings of the specified expressions.
  • expr_langvar: array of language variety IDs. Restricts results to definitions of meanings of expressions in the specified language varieties.
  • expr_txt: array of expression texts. Restricts results to definitions of meanings of expressions with matching texts.
  • expr_txt_degr: array of expression texts. Restricts results to definitions of meanings of expressions with matching texts in their degraded form.
  • expr_uid: array of language variety uniform identifiers. Restricts results to definitions of meanings of expressions in the specified language varieties.
  • id: array of definition IDs.
  • include: valid values are expr_langvar, expr_txt, expr_txt_degr, expr_uid, and uid.
  • langvar: array of language variety IDs. Restricts results to definitions in the specified language varieties.
  • meaning: array of meaning IDs. Restricts results to definitions of the specified meanings.
  • txt: array of definition texts.
  • txt_degr: array of definition texts to be matched in their degraded form.
  • uid: array of language variety uniform identifiers. Restricts results to definitions in the specified language varieties.

You must provide at least one parameter (other than include). Results will be returned for all matching definitions.

/definition/count

Returns the number of matching definitions. Parameters are the same as for /definition, but there are no required parameters.

/definition/<definition>

Returns information about a single definition, as definition. The include parameter is the same as for /definition. There are no other parameters.

Definition objects

Definition objects contain the following keys:

  • expr: ID number of the expression with which the definition shares a meaning (only if one of the expr parameters was specified in the query).
  • expr_langvar: language variety ID of the expression whose meaning is defined (only if in include, and one of the expr parameters was specified in the query).
  • expr_txt: text of the expression whose meaning is defined (only if in include, and one of the expr parameters was specified in the query).
  • expr_txt_degr: degraded text of the expression whose meaning is defined (only if in include, and one of the expr parameters was specified in the query).
  • expr_uid: language variety uniform identifier of the expression whose meaning is defined (only if in include, and one of the expr parameters was specified in the query).
  • id: ID number of the definition.
  • langvar: ID number of the language variety in which the definition is written.
  • meaning: ID number of the meaning to which the definition belongs.
  • txt: text of the definition.
  • txt_degr: degraded text of the definition.
  • uid: uniform identifier of the language variety in which the definition is written (only if in include).

Source queries

/source

Returns information about the specified sources, as result array. Parameters:

  • expr: array of expression IDs. Restricts results to sources containing all of the specified expressions, whether in the same meaning or not.
  • format: array of source formats.
  • grp: array of source group IDs.
  • id: array of source IDs.
  • include: valid values are denotation_count, denotation_count_estimate, directory, format, langvar, langvar_attested, meaning_count, and usr.
  • label: array of source labels.
  • langvar: array of language variety IDs. Restricts results to sources with those declared language varieties.
  • meaning: boolean value. Restricts results to sources with one or more meanings (if true) or no meanings (if false).
  • trans_expr: array of expression IDs. Restricts results to sources with at least one meaning that contains all of the specified expressions.
  • uid: array of language variety uniform identifiers. Restricts results to sources with those declared language varieties.
  • usr: array of PanLem usernames. Restricts results to sources with at least one of the PanLem users as a meaning editor.

Results will be returned for all matching sources. If you do not specify a search parameter, results will be returned for all sources in PanLex. You cannot specify expr and trans_expr simultaneously.
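The difference between expr and trans_expr can be easy to miss, so here is a small Python sketch that imitates the two filters on a toy in-memory model. The data structure (each source as a list of meanings, each meaning as a set of expression IDs), the source names, and the function names are all invented for illustration; this is not how the server implements the query.

```python
# Toy model: each source maps to a list of meanings, and each meaning is
# the set of expression IDs it contains. All values are hypothetical.
sources = {
    "src_a": [{1, 2}, {3}],  # expressions 1 and 2 share a meaning
    "src_b": [{1}, {2}],     # expressions 1 and 2 are in different meanings
}


def matches_expr(meanings, expr_ids):
    """expr: the source contains all expressions, in any meanings."""
    present = set().union(*meanings) if meanings else set()
    return set(expr_ids) <= present


def matches_trans_expr(meanings, expr_ids):
    """trans_expr: at least one single meaning contains all expressions."""
    wanted = set(expr_ids)
    return any(wanted <= meaning for meaning in meanings)


by_expr = [name for name, m in sources.items() if matches_expr(m, [1, 2])]
by_trans_expr = [name for name, m in sources.items() if matches_trans_expr(m, [1, 2])]

print(by_expr)        # ['src_a', 'src_b']
print(by_trans_expr)  # ['src_a']
```

In this model, trans_expr is the stricter filter: it keeps only sources in which the given expressions are translations of one another.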

/source/count

Returns the number of matching sources. Parameters are the same as for /source.

/source/<source>

Returns information about a single source, as source. The include parameter is the same as for /source. There are no other parameters.

Source objects

Source objects contain the following keys:

  • author: author(s).
  • denotation_count: number of denotations of the source’s meanings (only if in include).
  • denotation_count_estimate: estimated number of denotations of the source’s meanings prior to ingestion (only if in include).
  • directory: name of directory in source archive (only if in include; mainly for internal use).
  • format: source formats (only if in include).
  • grp: ID of source group to which the source belongs.
  • id: source ID number.
  • ip_claim: summary of intellectual property claim (if known).
  • ip_claimant: intellectual property claimant (if known).
  • ip_claimant_email: intellectual property claimant’s email address (if known).
  • isbn: ISBN number.
  • label: label.
  • langvar: array of IDs of language varieties declared as documented in the source (only if in include).
  • langvar_attested: array of IDs of language varieties attested in the source’s denotations (only if in include).
  • license: license type; can be “copyright”, “Creative Commons”, “GNU Free Documentation License”, “GNU General Public License”, “GNU Lesser General Public License”, “MIT License”, “other”, “PanLex Use Permission”, “public domain”, “request”, or “unknown”.
  • meaning_count: number of meanings in the source (only if in include).
  • note: miscellaneous notes.
  • publisher: publisher.
  • quality: quality rating assigned by PanLex editor (0 = lowest, 9 = highest).
  • reg_date: date added to PanLex.
  • title: title.
  • url: URL.
  • usr: array of meaning editors’ PanLem usernames (only if in include).
  • year: year of publication.

Normalization queries

/norm/expr/<langvar>

Returns normalization scores and normalized texts for a set of expression texts in a language variety, as norm. Parameters:

  • degrade: boolean value indicating whether to compare the degraded text of each value in txt against the degraded text of existing expressions in PanLex. Defaults to false.
  • grp: array of source group IDs. Meanings from these source groups will be ignored when calculating scores. Defaults to an empty array.
  • txt: array of expression texts to normalize.

The returned norm object maps each expression text, as a key, to an object (when degrade is false) or an array of objects (when degrade is true) containing normalization information. The object or objects’ score key contains the expression text’s normalization score. This is the sum of the quality ratings (quality) of the sources of the expression’s denotations. (Multiple sources from the same source group are counted as a single attestation for this purpose.) Thus, the more sources attest the existence of an expression, the higher its score, but the score is weighted by source quality. If no expression exists with the corresponding text, the score will be zero.

When the degrade option is used, the returned array of objects contains scores for all expressions whose degraded texts (their txt_degr values) match the degraded texts of the supplied txt values. The objects’ txt key contains each expression’s text. The array is sorted by score in descending order.
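Because the shape of each norm entry depends on degrade, client code usually needs to handle both cases. The Python sketch below shows one possible way to do that; the function name best_normalization and the sample responses are hypothetical, and the scores are made-up numbers.

```python
def best_normalization(norm, txt):
    """Return (text, score) for the best candidate for txt, or None.

    Handles both documented shapes: a single object (degrade false) and
    an array of objects sorted by descending score (degrade true).
    """
    entry = norm.get(txt)
    if entry is None:
        return None
    if isinstance(entry, list):
        if not entry:
            return None
        top = entry[0]  # the array is documented as sorted by score, descending
        return top["txt"], top["score"]
    # Without degrade, the key itself is the expression text being scored.
    return txt, entry["score"]


# Hypothetical norm objects, shaped as described above.
norm_exact = {"dog": {"score": 42}, "dgo": {"score": 0}}
norm_degraded = {"Dog": [{"txt": "dog", "score": 42}, {"txt": "Dog", "score": 5}]}

exact_result = best_normalization(norm_exact, "dog")
degraded_result = best_normalization(norm_degraded, "Dog")
```

A score of zero in the non-degraded case means no expression with that exact text exists, so a caller may want to retry with degrade set to true before giving up.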

/norm/definition/<langvar>

Returns normalization scores and normalized texts for a set of definition texts in a language variety, as norm. Parameters:

  • degrade: boolean value indicating whether to compare the degraded text of each value in txt against the degraded text of existing definitions in PanLex. Defaults to false.
  • grp: array of source group IDs. Meanings from these source groups will be ignored when calculating scores. Defaults to an empty array.
  • txt: array of definition texts to normalize.

The returned norm object maps each definition text, as a key, to an object or array of objects containing normalization information. Its format and the algorithm used are the same as for expression normalization (see above).

Text degradation queries

/txt_degr

Returns degraded texts for arbitrary input texts, as txt_degr. Parameters:

  • txt: array of texts to degrade.

The returned txt_degr object maps each input text (as a key) to its degraded text.

Suggestion queries (for autocomplete and similar)

These queries let you retrieve matching database objects based on a search string. They were created to provide suggestions for autocompletion of text fields, but could be used for other things in principle.

/suggest/expr_trans

Suggests a list of expressions, optionally restricted by language variety and/or source, with translations. Parameters:

  • include: valid values are trans_quality and uid. If you include uid, suggested expressions will distinguish those in different varieties that share an expression text. If you do not include uid, these will be considered a single suggestion.
  • langvar: array of IDs of language varieties to suggest expressions from.
  • limit: maximum number of suggestions to return. Default 50; cannot be more than 100.
  • no_trans_langvar: array of IDs of language varieties from which to exclude translations.
  • pref_trans_langvar: array of IDs of language varieties from which to prefer translations. If not enough translations are found in these varieties, other varieties will be used instead.
  • prefix: boolean value indicating whether txt should match the beginning of the expression text only. Default false.
  • sort_langvar: array of IDs of language varieties in order of sort priority. When suggested expressions are found in multiple varieties, they will be sorted in the order given here unless other criteria take precedence (see below).
  • source: array of IDs of sources to look up translations from. If the parameter is omitted, all sources will be used.
  • trans_limit: maximum number of translations per suggestion. Default 3; cannot be more than 10.
  • txt: search string from which to generate suggestions. This parameter is required.

Response objects contain two keys: suggestType (“expr_trans”) and suggest, an array of suggestion objects. Suggestion objects contain the following keys:

  • id: expression ID number (only if uid is in include).
  • langvar: expression’s language variety ID (only if uid is in include).
  • name_expr_txt: expression’s language variety default name expression text (only if uid is in include).
  • trans: array of translation objects, each containing the following keys:
    • pref_trans_langvar: array of language variety IDs indicating which varieties passed in pref_trans_langvar attest this translation, or null if none do.
    • trans_quality: translation quality score (only if in include).
    • txt: translation expression text.
  • txt: expression text.
  • uid: expression’s language variety uniform identifier (only if uid is in include).

Suggestion objects are sorted according to the following criteria, in order of precedence:

  1. Whether the suggested expression’s text matches the search string at the beginning of a word boundary.
  2. Whether one of the suggested expression’s translations matches the search string at the beginning of a word boundary in one of the pref_trans_langvar varieties.
  3. Whether one of the suggested expression’s translations matches the search string at the beginning of a word boundary in any language variety.
  4. Whether one of the suggested expression’s translations matches the search string in degraded form.
  5. The suggested expression’s language variety, in sort_langvar order.
  6. The suggested expression’s translations, sorted alphabetically in degraded form.
  7. The suggested expression’s translations, sorted alphabetically in exact form.
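The parameter limits and response shape described above for /suggest/expr_trans can be sketched in Python as follows. The helper names, the sample search string, and the sample response are assumptions for illustration only; the checks mirror the documented limits (limit at most 100, trans_limit at most 10, txt required).

```python
def build_expr_trans_suggest(txt, limit=50, trans_limit=3, **extra):
    """Build a JSON-ready parameter object for /suggest/expr_trans."""
    if not txt:
        raise ValueError("txt is required")
    if limit > 100:
        raise ValueError("limit cannot be more than 100")
    if trans_limit > 10:
        raise ValueError("trans_limit cannot be more than 10")
    return {"txt": txt, "limit": limit, "trans_limit": trans_limit, **extra}


def summarize_suggestions(response):
    """Reduce a response to (expression text, [translation texts]) pairs."""
    return [
        (s["txt"], [t["txt"] for t in s["trans"]])
        for s in response["suggest"]
    ]


params = build_expr_trans_suggest("hou", prefix=True, trans_limit=2)

# Hypothetical response, shaped as documented above.
sample_response = {
    "suggestType": "expr_trans",
    "suggest": [
        {
            "txt": "house",
            "trans": [
                {"txt": "casa", "pref_trans_langvar": None},
                {"txt": "maison", "pref_trans_langvar": None},
            ],
        }
    ],
}

summary = summarize_suggestions(sample_response)
```

The suggest array is already ordered by the criteria listed above, so the summary preserves the server's ordering rather than re-sorting.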

/suggest/langvar

Suggests a list of language variety uniform identifiers, with translations (i.e., language variety names) from source art:PanLex. Parameters:

  • include: valid values are expr_count, grp, meaning, mutable, name_expr, region_expr, region_expr_langvar, region_expr_txt, script_expr, script_expr_langvar, and script_expr_txt.
  • limit: maximum number of suggestions to return. Default 50; cannot be more than 100.
  • pref_trans_langvar: array of IDs of language varieties from which to prefer translations. The first translation is always the language variety’s default name. Remaining translations will be preferred from these varieties. If not enough preferred translations are found, other varieties will be used instead.
  • prefix: boolean value indicating whether txt should match the beginning of the expression text only. Default false.
  • trans_limit: maximum number of translations per suggestion. Default 3; cannot be more than 10.
  • txt: search string from which to generate suggestions. This parameter is required.

Response objects contain two keys: suggestType (“langvar”) and suggest, an array of suggestion objects. Suggestion objects contain the following keys:

  • expr_count: total number of expressions in the language variety (only if in include).
  • grp: language variety group ID (only if in include).
  • id: language variety ID.
  • meaning: ID of the art:PanLex meaning associated with the language variety (only if in include).
  • mutable: boolean value indicating whether the language variety is mutable (only if in include).
  • name_expr: language variety default name’s expression ID (only if in include).
  • region_expr: language variety’s primary region expression ID (only if in include).
  • region_expr_langvar: language variety’s primary region expression’s language variety ID (only if in include).
  • region_expr_txt: language variety’s primary region expression text (only if in include).
  • script_expr: language variety’s script, coded as the language variety art-262 (ISO 15924) expression ID (only if in include).
  • script_expr_langvar: language variety’s script expression’s language variety ID (only if in include).
  • script_expr_txt: language variety’s script expression text (only if in include).
  • trans: array of translation objects, each containing the following keys:
    • pref_trans_langvar: array of language variety IDs indicating which varieties passed in pref_trans_langvar attest this translation, or null if none do.
    • txt: translation expression text.
  • uid: language variety uniform identifier.

Suggestion objects are sorted according to the following criteria, in order of precedence:

  1. Whether the suggested language variety’s uniform identifier matches the search string at the beginning of a word boundary.
  2. Whether one of the suggested language variety’s names matches the search string at the beginning of a word boundary in the default name or one of the pref_trans_langvar varieties.
  3. Whether one of the suggested language variety’s names matches the search string at the beginning of a word boundary in any language variety.
  4. Whether one of the suggested language variety’s names matches the search string in degraded form.
  5. The number of expressions in the suggested language variety, from most to least.
  6. The suggested language variety’s uniform identifier in alphabetical order.

Fallback queries (chaining several queries)

/fallback

Takes a list of queries, runs them in sequence until one of them returns more than zero results, and returns the results of that query. Each query in the list must return its results in the result array. Parameters:

  • requests: array of PanLex API request objects. Each object should have two keys: url, a string containing the request URL path (for example, /expr or /v2/expr) and query, an object containing the query parameters. The maximum number of requests is 10.

The response contains data from the first request in the list returning at least one item in its result array. The requestNum key is set to the index of the request in requests that returned results (starting from 0). If no request returned results, the final request’s response will be returned with its empty result array.
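To make the request format and the meaning of requestNum concrete, here is a minimal Python sketch. The helper names, the sample queries, and the sample response are hypothetical; the sketch only relies on the documented structure (a requests array of at most 10 objects with url and query keys, and a zero-based requestNum in the response).

```python
MAX_FALLBACK_REQUESTS = 10  # documented maximum


def build_fallback_body(requests):
    """Validate and wrap a list of request objects for /fallback."""
    if len(requests) > MAX_FALLBACK_REQUESTS:
        raise ValueError("at most 10 requests are allowed")
    for request in requests:
        if set(request) != {"url", "query"}:
            raise ValueError("each request needs exactly the keys url and query")
    return {"requests": requests}


def winning_request(body, response):
    """Return the request that produced the results, or None if none did."""
    if not response.get("result"):
        # Every request came back empty; the final response was returned.
        return None
    return body["requests"][response["requestNum"]]


# Illustrative chain: try an exact text match first, then a degraded match.
body = build_fallback_body([
    {"url": "/expr", "query": {"uid": ["eng-000"], "txt": ["Dog"]}},
    {"url": "/expr", "query": {"uid": ["eng-000"], "txt_degr": ["Dog"]}},
])

# Hypothetical response in which the second request (index 1) matched.
sample_response = {"requestNum": 1, "result": [{"id": 1, "txt": "dog"}]}
matched = winning_request(body, sample_response)
```

Checking for an empty result array before reading requestNum avoids depending on what requestNum contains when no request matched, which this documentation does not specify.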

Clients

PanLex API client modules are available for node.js, Python, Perl (as part of the PanLex tools), and Ruby.