Standard acquisition tools | PanLex development

Introduction

Acquisition relies mainly on standard information-processing tools, described below.

Computer hardware

For acquisition work, we recommend that you use at least a mid-range portable computer. Examples of staff-recommended configurations as of November 02015:

Computers with solid-state drives:

Dell Latitude E5440, $1,000
Acer VN7-571G-719D, $950
Lenovo IdeaPad Y40, $1,200 (no optical drive)

Computers without solid-state drives (slower random-access retrieval from storage, e.g. when jumping between distant pages of large PDF files):

Acer Aspire E5-571-74F7, $770
Lenovo ThinkPad E550 (20DF0040US), $730
13-inch MacBook Pro (Z0MT0LL/A), 2.5GHz Dual-core Intel Core i5, 8GB 1600MHz DDR3 SDRAM, 500GB Serial ATA Drive @ 5400 rpm, SuperDrive 8x (DVD±R DL/DVD±RW/CD-RW), $1,200

Computer software

In most cases, acquisition requires standard software, such as web browsers, PDF viewers, and text editors.

Some documentary resources are organized in a way that makes it advantageous to use tools designed for rule-based retrieval of files from servers. For example, some resources consist of hundreds or thousands of web documents, most of which do not exist until you request them and are then generated on demand. Such tools let you download all of these documents with a single command. The most common ones are wget and cURL. They are generally run from a terminal window, where you enter a command that names the program and sets the parameters of the particular job that you want it to do.
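As an illustration of what "rule-based" means here, the following minimal Python sketch expands a numbered URL pattern into a list of page addresses and assembles a single wget command that would fetch them all. The URL, the page range, and the function names page_urls and wget_command are hypothetical and chosen only for this example; the wget options shown (--wait, --no-clobber, --directory-prefix) are standard ones, but the right settings depend on the resource and on the server's terms of use. The sketch only builds and prints the command; it does not run it.

```python
import shlex
from typing import List


def page_urls(template: str, first: int, last: int) -> List[str]:
    """Expand a URL template containing '{n}' into one URL per page number."""
    return [template.format(n=n) for n in range(first, last + 1)]


def wget_command(urls: List[str], dest_dir: str, wait_seconds: int = 1) -> List[str]:
    """Build (but do not run) a wget invocation that fetches every URL."""
    return [
        "wget",
        f"--wait={wait_seconds}",          # pause between requests
        "--no-clobber",                    # skip files that already exist locally
        f"--directory-prefix={dest_dir}",  # save everything under dest_dir
        *urls,
    ]


if __name__ == "__main__":
    # Hypothetical resource: pages 1 to 3 of a dictionary served page by page.
    urls = page_urls("https://example.org/dictionary?page={n}", 1, 3)
    # shlex.join quotes each argument so the line can be pasted into a terminal.
    print(shlex.join(wget_command(urls, "pages")))
```

The printed line can be reviewed and then pasted into a terminal. Keeping the URL rule separate from the download command makes it easy to check the list of addresses before any request is sent.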

Some of our acquisition involves a partnership between PanLex and the Internet Archive. We donate acquired book-format resources to the Internet Archive, which digitizes them, lends the digital copies to borrowers everywhere, and makes the digital files available to us for assimilation. One tool for working with the digital files and the Internet Archive metadata is the Internet Archive Python Module and its command-line tool.
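The following Python sketch suggests how the module might be used to pick out and download only some of the files of a digitized item. It assumes that the internetarchive package is installed and that each file record carries "name" and "format" fields. The helper names select_files and download_selected, the sample records, and the format labels are illustrative assumptions, not a description of PanLex's actual workflow.

```python
from typing import Dict, Iterable, List


def select_files(files: Iterable[Dict[str, str]], wanted_formats: Iterable[str]) -> List[str]:
    """Return the names of the file records whose 'format' is one of wanted_formats."""
    wanted = set(wanted_formats)
    return [f["name"] for f in files if f.get("format") in wanted]


def download_selected(identifier: str, wanted_formats: Iterable[str], dest_dir: str) -> List[str]:
    """Download only the matching files of one Internet Archive item."""
    # Imported here so that select_files can be tried without the package installed.
    from internetarchive import get_item

    item = get_item(identifier)  # fetches the item's metadata
    names = select_files(item.files, wanted_formats)
    if names:  # do nothing when no file matches
        item.download(files=names, destdir=dest_dir)
    return names


if __name__ == "__main__":
    # Illustrative file records; real items list their own names and formats.
    sample = [
        {"name": "book.pdf", "format": "Text PDF"},
        {"name": "book_djvu.txt", "format": "DjVuTXT"},
        {"name": "book_jp2.zip", "format": "Single Page Processed JP2 ZIP"},
    ]
    print(select_files(sample, ["Text PDF", "DjVuTXT"]))
```

Separating the selection rule from the download call lets you inspect which files would be retrieved before transferring anything.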

Field acquisition tools

Projects under way may create tools that PanLex can use to obtain data directly from persons who have applicable knowledge or sources.

Examples of tools that may become useful:

Kamusi Fidget Widget (use case described in 02014 paper)
Aikuma App