By Edgar T. Irons
Communications of the ACM,
January 1983,
Vol. 26 No. 1, Pages 14-16
10.1145/357980.357986
Although one generally thinks of a compiler as a program for a computer which translates some object language into a target language, in fact this program also serves to define the object language in terms of the target language. In early compilers, these two functions are fused inextricably in the machine language program which is the compiler. This fusion makes incorporation into the compiler of extensions or modifications to the object language extremely difficult.
This paper describes a compiling system which essentially separates the functions of defining the language and translating it into another. Part 1 presents the meta-language used to define the object language in terms of the target language. This meta-language is an extension of the syntax meta-language used in the ALGOL 60 report which allows specification of meaning (in terms of the target language) as well as of form. This succinct definition allows modifications to the form or meaning of the object language to be incorporated easily into the system, and in fact makes the original specification of the object language a reasonably easy task. Part 2 is a description of the program which utilizes a direct machine representation of the meta-linguistic specifications to effect a translation.
Before proceeding to a description of the meta-language we wish to demonstrate heuristically that the proposed meta-language does suffice to specify a translation for any language it can describe. If one proposes to translate language A into language B, it is necessary to have some kind of description of language A in terms of language B. More specifically, one must be able to describe the alphabet of A in B, and must have a set of rules for assigning meaning in B to various possible structures which can be formed in A by concatenating the characters of A's alphabet. The set of rules might be called the syntax of language A, if one considers definitions (in the usual sense of the word) to be merely additional rules of syntax. A translation process might then be to start with the beginning symbols of the string to be translated and to assign meaning and a new syntactic name to symbol groups as they fall into the several syntactic structures. Having thus formed a new set of syntactic elements, the next step is to modify the meanings or amplify them according to the new structures into which these syntactic elements fall. If one considers the characters of the alphabet to be syntactical units themselves, the two steps in the process are indeed identical. Evidently the only restriction necessary to make such a description uniquely specify a language is that there be a unique syntactic structure for any possible finite string of symbols in the language.
Given that the syntactic description meets this uniqueness requirement, a translation using the description can be effected by fitting already discovered syntactic units (starting with the syntactic units which are the basic symbols of the language) into the syntactic structure to produce a new set of larger syntactic units, and assigning meanings to these new units according to the meanings of the original units. This translation algorithm is then repeated on the new set of larger syntactic units. Since several syntactic units are coalesced effectively into one each time the translation algorithm is performed, the process will converge to exactly one syntactic unit, a “program”. Also the meaning of that unit will have been discovered. In essence, the process is to diagram the string of symbols according to syntax and simultaneously to keep track of the meaning associated with each structure. It therefore suffices to define language A in terms of B by listing a series of “definitions”, each definition listing:
1. A string of syntactic units in A.
2. One syntactic unit of A which is the equivalent of the string.
3. The meaning (described in language B) associated with the syntactic structure or, if meaning has already been assigned to some of the syntactic units of 1, modifications or amplifications of these meanings.
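To make the three-part definitions and the repeated coalescing step concrete, here is a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the paper: language A is a toy language of single digits joined by "+", language B is a hypothetical stack-machine notation (PUSH, ADD, HALT), and the names Definition, DEFINITIONS, basic_units, reduce_once and translate are invented here. The leftmost-first matching strategy is a simplification and should not be read as the algorithm of Part 2.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

# A syntactic unit: (syntactic name in A, meaning expressed in B).
Unit = Tuple[str, str]


@dataclass(frozen=True)
class Definition:
    pattern: Tuple[str, ...]             # a string of syntactic units in A
    result: str                          # the one unit equivalent to that string
    meaning: Callable[[List[str]], str]  # new meaning built from the parts' meanings


# Hypothetical definitions for the toy language; order matters at each position.
DEFINITIONS = [
    Definition(("sum", "+", "digit"), "sum", lambda m: f"{m[0]}; {m[2]}; ADD"),
    Definition(("digit",), "sum", lambda m: m[0]),
    Definition(("sum",), "program", lambda m: f"{m[0]}; HALT"),
]


def basic_units(text: str) -> List[Unit]:
    """Treat each character of A's alphabet as a syntactic unit with a basic meaning."""
    units: List[Unit] = []
    for ch in text:
        if ch in "0123456789":
            units.append(("digit", f"PUSH {ch}"))
        elif ch == "+":
            units.append(("+", ""))
        else:
            raise ValueError(f"character {ch!r} is not in the alphabet")
    return units


def reduce_once(units: List[Unit]) -> Optional[List[Unit]]:
    """Coalesce the leftmost group of units that fits a definition, or return None."""
    for start in range(len(units)):
        for d in DEFINITIONS:
            end = start + len(d.pattern)
            names = tuple(name for name, _ in units[start:end])
            if names == d.pattern:
                meanings = [meaning for _, meaning in units[start:end]]
                return units[:start] + [(d.result, d.meaning(meanings))] + units[end:]
    return None


def translate(text: str) -> str:
    """Repeat the coalescing step until no definition applies."""
    units = basic_units(text)
    while True:
        reduced = reduce_once(units)
        if reduced is None:
            break
        units = reduced
    if len(units) != 1 or units[0][0] != "program":
        raise ValueError("string does not reduce to a single program unit")
    return units[0][1]


print(translate("1+2+3"))
```

Each pass replaces one group of units by a single larger unit and computes its meaning from the meanings already assigned, so a well-formed string ends as one "program" unit whose meaning is the translation. A string such as "1+" leaves two units behind and is rejected.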
When human beings translate one language into another, they generally consult a subset of rules like these, namely a dictionary, in order to assign basic meanings to concatenated characters of the alphabet. The rules for assigning meaning (or modifying meanings) of larger syntactic units are usually kept in the head of even the amateur translator. Nevertheless, the rules must be referenced to complete any translation and, if we are to ask a computing machine to translate for us, we must be able to make these rules explicit.