Computing Semantic Representations with Referent Systems


Version 5. Date: Tuesday, April 10, 2007

This project page is about referent systems. It consists of two parts, really: a manuscript and a program.

The manuscript reflects only Referent Systems Version 4. In other words, the program implements features not yet discussed in the book. However, the 4th revision will incorporate version 5.1. Click for an online demo.
  1. Source Code
  2. Installation
  3. Tk-Interface
  4. Standalone Version
  5. Changelog
  6. To Do
  7. Foreign Language Support
  8. Other Platforms
  9. Acknowledgments

Source Code

Version 5.1 comes with source code. You may use it subject to the GNU license terms. Naturally, I decline any responsibility connected with the use of this software. Note also that what I describe below works under Unix only (which includes Mac OS X and Linux). If you insist on using Windows, you should probably install Cygwin first. (See also the howto manual linked to below.)

Installation

If you want a standalone version to play with, here it is. The minimal version requires

There is a somewhat easier version using Tcl/Tk, if you want a graphical interface.

To install the program, download the file referent_v5-1.tar. It contains the following items:

Put the archive into the directory where you want to install the system, say, <RefSys>. Unpack the archive using

tar xvf referent_v5-1.tar

You should now have the directories dict and bin: dict is where the dictionaries go, and bin holds the binaries. Now type

chmod +x bin/*

This makes the files in bin executable. Next, type

PATH=full-path-to-RefSys/bin:$PATH

This adds <RefSys>/bin to your path. (Otherwise, you will have to prefix each command with bin/.) It is best to add this line to your .bashrc; otherwise you will have to retype it every time you start a new shell.

When you first run the installers, a new directory parse is created in <RefSys>. This is the directory where all parses will go. The interactive installers are available in both German and English, and you will be prompted for the installation language. This choice determines the language in which the system talks to you and in which the user interface generates its output. Do not skip this step.

Next you will be prompted whether you want a system install. If you are running the installer for the first time (or if you changed the sources of the system itself), say "Yes". That will generate all libraries. If you want to recompile only the dictionaries, say "No". Next you will be prompted whether you want to recompile the dictionaries. After that you can choose between the ...

Errors are redirected into compile.log. You are told whether the installation was successful. If not, take a look at compile.log to see what went wrong.
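The earlier suggestion to put the PATH line into .bashrc can be sketched as a small shell function that appends the line only if it is not already there. This is a minimal sketch, not part of the distribution: the function name add_refsys_path and its arguments are assumptions made for illustration.

```shell
# Hypothetical helper (not shipped with the archive): append the
# PATH line for <RefSys>/bin to a startup file exactly once.
add_refsys_path() {
    refsys_dir=$1                    # full path to <RefSys>
    rc_file=${2:-$HOME/.bashrc}      # startup file, defaults to ~/.bashrc
    # The \$ keeps $PATH literal, so it is expanded when the shell starts.
    line="PATH=$refsys_dir/bin:\$PATH"
    # grep -F: fixed string, -x: match the whole line, -q: no output
    if ! grep -Fxq "$line" "$rc_file" 2>/dev/null; then
        printf '%s\n' "$line" >> "$rc_file"
    fi
}
```

Calling add_refsys_path with the full path to <RefSys> as its only argument appends the line to ~/.bashrc; calling it again changes nothing.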

Tk-Interface

You can create a graphical interface using compile. Since the interface is created via compile, feel free to add new dictionaries or change the existing ones, as long as you remember to invoke compile afterwards. To invoke the interface, type rs. It opens a menu where you can enter your sentence word by word or type it into the window. When you call "parse", it creates a file parse.dvi and then uses XDvi to show it to you. It also creates a file

parse/date<date>at<time>.tex

This way it never actually overwrites any files (unless you happen to create two of them within the same minute ...). Make sure to clean out the directory parse from time to time.
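Cleaning the parse directory can be sketched with find. The function below is a hypothetical helper, not part of the system; it assumes that the saved files match the pattern date*at*.tex described above and that anything older than a chosen number of days may be deleted.

```shell
# Hypothetical helper: delete saved parses older than a given number of days.
clean_parse() {
    parse_dir=${1:-parse}    # directory holding the saved parses
    days=${2:-30}            # age threshold in days
    # -mtime +N selects files last modified more than N days ago.
    find "$parse_dir" -type f -name 'date*at*.tex' -mtime +"$days" -exec rm -f {} +
}
```

Run from <RefSys>, clean_parse parse 30 removes saved parses older than 30 days and leaves newer ones, and any other files, untouched.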

Standalone Version

You do not need to produce the Tk-interface to use the system. You can also install the standalone version. (The compile script will do that for you.) Simply type referent after that. Then issue #use "dict/deu.ml";; (or whatever the name of the dictionary you want to use is). The system is ready. To parse, issue parse_show "sentence";; after the prompt and watch the result. If things go right, a DVI viewer pops up and shows you the result. For that, you must obviously have LaTeX installed. This interface is much more flexible, and at some point I shall describe its functionality in more detail.

Changelog

Version 3
The main change is in the modularisation of the program. There are also intrinsic changes. The new version separates the syntax and the semantics in the initial parsing stage. During parsing, it only computes the set of viable parse terms. When it is finished, it unravels the viable parse terms into real entries and shows them to the user. Another change is that it now allows for polyadic merge, so it can handle complex predicates correctly.
Version 4
Apart from bug fixes, the new version creates an interactive web page, where users can upload dictionaries and enter sentences. The upload is done in several stages. For security reasons the system checks whether the file looks like a dictionary. If it does not comply with the rules given in the manual, it will not be uploaded. After uploading, the system creates an executable (if possible) and rehashes the pages. A new page is created where one can enter strings in the new dictionary. Entering strings is done by clicking on the items (to avoid awkward keyboard issues for foreign languages).
Version 5.0

The most visible change is the increased support for morphology. Entries have a morphology; this is a morpheme, where a morpheme is a set of morphs. Morphs in turn are complex structures, allowing the use of several strings and features for strings. This makes it possible to treat the English plural, for example, with a single entry, so the semantics need not be repeated. The morphological decomposition tables used in the previous version are now redundant (they were also a source of exponential slowdown).

The Tcl-script is much simplified.

Version 5.1

It is now possible to input non-ASCII symbols with the standalone system using a standard keyboard. Input is through specified sequences, as in HTML, but the coding is flexible.

The exponents are now arrays of so-called glued strings. These are strings with optional conditions on what kinds of strings they can be concatenated with. The conditions are of the form "concatenates with a string that has/does not have a suffix suf if appended (prefix pref if prepended)". Variables are now pairs (string, number), and most string output uses buffers to speed up the output.
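To illustrate the idea of a glued string, here is a rough shell sketch of the appended case only. It is not the program's OCaml implementation; the function names, the has/lacks keywords and the English plural example are all assumptions made for illustration.

```shell
# can_append STEM MODE SUF
# Succeeds if an exponent may be appended to STEM, where the condition is
# "STEM has suffix SUF" (MODE=has) or "STEM does not have suffix SUF" (MODE=lacks).
can_append() {
    case $1 in
        *"$3") [ "$2" = has ] ;;      # STEM ends in SUF
        *)     [ "$2" = lacks ] ;;    # STEM does not end in SUF
    esac
}

# glue_suffix STEM EXPONENT MODE SUF
# Prints STEM followed by EXPONENT if the condition holds; fails otherwise.
glue_suffix() {
    if can_append "$1" "$3" "$4"; then
        printf '%s%s\n' "$1" "$2"
    else
        return 1
    fi
}
```

With these definitions, glue_suffix bus es has s prints buses and glue_suffix cat s lacks s prints cats, while glue_suffix cat es has s fails because the condition is not met.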

To Do



Foreign Language Support

The interface is available in English and German only. When making dictionaries, note that although OCaml can only handle ISO-Latin-1, you may define strings for your language in other charsets, too. Tk displays them (it is fully Unicode compatible). LaTeX is a bit trickier, since the typewriter font is somewhat incomplete. Apart from that, it works as follows. Write the strings of the language into the dictionary in UTF-8. They are passed by OCaml to Tk and LaTeX, which render them assuming UTF-8 encoding throughout. For input into the Tk-Interface I have built a small converter that allows you to use a standard keyboard to enter foreign symbols (a bit like HTML character codes).
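Since the strings are expected to be in UTF-8 throughout, it may help to check a dictionary file before compiling it. The following is a minimal sketch using the standard iconv utility; the function name is_utf8 is hypothetical and the check is not part of the installers.

```shell
# Hypothetical helper: succeed only if FILE is valid UTF-8.
# iconv exits with a non-zero status when it meets an invalid byte sequence.
is_utf8() {
    iconv -f UTF-8 -t UTF-8 "$1" > /dev/null 2>&1
}
```

For example, is_utf8 dict/deu.ml && echo "encoding looks fine" reports a file that passes; a file saved in ISO-Latin-1 with accented characters would fail the check.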

Other Platforms

The software (standalone, Tcl/Tk) has been successfully installed on both Unix/Linux and Mac OS X. For that you can use the compile or the easy-compile installer, depending on whether you have dialog installed. For Windows, the only way is to install Cygwin before installing everything else. I still need to find out exactly how this can be done.

Acknowledgments

Referent systems are due to Kees Vermeulen. I am indebted to him as well as to Albert Visser for the theory part. The implementation is my own. Its creation was sponsored by two successive senate grants from UCLA. I have enjoyed the help of Cory Hill, Ben Keil, Adam Skory and Joseph Vaughan.


Send any reports of error (or praise) to Marcus Kracht.