archives of the CONLANG mailing list
------------------------------------

Subject: Lord's prayer
Date: Fri, 01 May 92 10:36:04 +1000
From: nsn@mullian.ee.Mu.OZ.AU (Nick Nicholas)

Oh, one supplementary note to my previous posting, which I fear And didn't get: it's "forgive [allow, let off; but the term for "remission of sins" comes from the same word] us our debts {just like we / so that we may} let off our debtors." The just like/so that ambiguity depends on whether aphiemen is a subjunctive or indicative; but maybe "o:s" plays up with its complements anyway. Sigh. Take it to a hellenist :(

---
'Dera me xhama t"e larm"e,           T  Nick Nicholas, EE & CS, Melbourne Uni
Dera mbas blerimit                   |  Mail: nsn@munagin.ee.mu.oz.au
Me xhama t"e larm"e!                 |  "Omiloume ellhnika/Esperanto parolata/
Lumtunia nuk ka ngjyra tjera.'       |  {mika'e tavla baula lojban.je'uru'e}
- Martin Camaj, _Nj"e Shp'i e Vetme_ |  (Better .sig suggestions welcome)

Date: 05 May 92 01:37:43 EDT
From: Don Harlow <72627.2647@CompuServe.COM>
To: Conlang
Subject: For the Birds

To: Conlang >INTERNET:conlang@buphy.bu.edu
Dato: 920504

Prentiss Riddle asks about possible Zamenhofian arbitrariness:

>The classic example is "birdo", obviously derived from English "bird",
>when a much more widely understood European root would lead to
>something like "avo". "Avo" is of course the Esperanto word for
>"grandfather"; was it then "arbitrary" of Zamenhof to select a root
>seemingly at random from one of the other DEFIRS languages? Or were
>there still more root conflicts which he took into account before
>settling on "birdo" as the only remaining available choice?

Let me quote (or, better, translate) Vilborg's entry for "birdo":

BIRD/O (1; Un): A _bird_. Form based on A orthography without consideration of pronunciation as in e.g. _boato_, _gurdo_. Choice of root may have been inspired by Vp _b"od_ (Golden 1979c). 1894 _ukcel-_

>(On a side note, does anyone know the origin of the DEFIRS acronym?
>I assume it stands for "Deutsch, English, French, Italian, Russian and
>Spanish.")

I think you're right. This was not, of course, the acronym used for the languages from which Esperanto's vocabulary was largely taken; that would be DEFIRP+LG (Deutsch, English, French, Italian, Russian and Polish, plus Latin and Greek); but some modern Esperantist lexicographers like to pretend that Esperanto is a DEFIRS language, since that particular choice of source languages gives modern Romance languages a plurality, if not an absolute majority, which they didn't have for Zamenhof. Said lexicographers are still gritting their teeth over such words as _cxanojo_, _vejcxio_, _cxigongo_, _hibaksxo_, _usxuo_, etc., which don't fall into the DEFIRS classification by some thousands of miles and tens of thousands of years ...

=============================================
Don Harlow Redaktoro Esperanto U.S.A.
tel. (1 510) 222 0187
CompuServe [72627,2647]
Internet 72627.2647@compuserve.com
=============================================

Date: 05 May 92 01:36:15 EDT
From: Don Harlow <72627.2647@CompuServe.COM>
To: Conlang
Subject: RAH & Loglan/Lojban

To: Conlang >INTERNET:conlang@buphy.bu.edu
Dato: 920504

Rick Harrison kindly sent me a copy of his "Journal of Planned Languages" issue #14. After his comments on the technical aspect of the old "International Language Review" a couple of weeks ago (I remember ILR with great fondness, a fondness perhaps augmented by the space of time lying between then and now), I was a little disappointed with the size and appearance of JPL; but the contents were certainly interesting, more wide-ranging than those that appeared in ILR and generally less tendentious (ILR during the period I subscribed concentrated mostly on Ido, Occidental, Interlingua, Neo, and the evils of Esperanto).
One article somewhat more tendentious than the others caught my eye: Rick Morneau's "On the unsuitability of 'logical languages' for use as interlinguas in machine translation." Morneau writes: "In my opinion, Lojban is much less tractable than it _should_ be ... [as] a serious candidate ... Also ... Lojban has features that actually make it _unsuitable_ for use as an MT IL."

Since Morneau's arguments were taken from this forum, I expect that you'll all be familiar with them. I'm neither prepared nor qualified to say whether or not they're correct. But, _assuming_ (for the sake of argument) that they are, the question arises: why, then, have the designers of Loglan and Lojban put so much effort into making the language _parseable_ (i.e. computer-tractable) rather than _merely_ speakable?

One clue lies in lojbab's Apr. 29 report on the recent Court of Appeals decision regarding cancellation of the trademark on the name "Loglan." lojbab writes:

>It is less likely than ever that JCB and TLI can interfere in our
>efforts to reach out to all those who have heard of Loglan in Scientific
>American, Heinein's [sic] books, and elsewhere, before the fully public Lojban
>version of Loglan was developed.

lojbab certainly has a better idea than I do of where "elsewhere" might be; the only public mentions of Loglan with which I'm familiar were indeed the 1960 SA article and the "books" of Robert Heinlein. About which the following:

For those who may not know, Robert Anson Heinlein was perhaps America's best-known science-fiction author of the 1940's and 1950's. After graduating from Annapolis in the late twenties and then being invalided out of the navy for TB, he turned his hand to a number of things in which, because of the Great Depression, he was not notably successful. He began writing science-fiction in 1939, and went on to become the first modern SF author to break out of the genre publications into the more general literary magazines such as "Saturday Evening Post."
He continued to write SF almost until his death in the late eighties, but his best and most prolific period ended in about 1959.

Heinlein was respected as one of those rare authors of SF who really knew how things work. If you want a quick basic course in ecology, read his _Farmer in the Sky_. If you want to know how to run a revolution, read _Red Planet_. For advice on how to manipulate a parliamentary political system, try _Double Star_. Space suit maintenance and operation are described at length in _Have Space Suit, Will Travel_ (which, despite the name, is in my humble opinion one of the best SF books ever written). And if you happen to belong to a family of geniuses, and want to know how to keep them from flying apart by sheer centrifugal force, or even just how to keep them all from ending their lives breaking big rocks into little ones at Leavenworth, try _The Rolling Stones_.

It is not clear that Heinlein knew much about languages. As far as I can remember, his only foray into the subject was a one-paragraph rehash of the old von Humboldt classification system, already in disrepute at the time he set it down on paper, in _Double Star_ (1956); and even there, although he parrots words such as "agglutinative" and "polysynthetic," he does not explain what makes a "polysynthetic" language different from an "agglutinative" language.

A more positive indication of Heinlein's beliefs about language is to be found in _Expanded Universe_, a collection of articles and essays. At one point he tells how he learned Russian and read several great Russian novels in the original after having originally read them in English translation. In English, he says, he found them uniformly depressing; in Russian, they are even more depressing, and so -- he concludes -- they obviously _gained_ something in translation. That authors such as Dostoevsky may have _intended_ them to be depressing seems not to have occurred to him.
Heinlein's supposed familiarity with conlangs is based on the fact that, from time to time, he would drop the name of one of them in his works. I can't complain about that; a mention of Esperanto in "The Green Hills of Earth" was one of three factors that led me to Esperanto. But in fact the only times he ever indicated that he knew anything at all about any of those conlangs were two almost identical comments about "phonetic languages such as Esperanto" when talking about the feasibility of voice-controlled typewriters (in _The Door Into Summer_ [1958] and _Farnham's Freehold_ [1961]).

Other than these, I can think of only three (possibly four) mentions of conlangs in his works. There was his mention of Esperanto as the planetary language of Earth in "The Green Hills of Earth," written in the early 1940's. There was his use of Interlingua as a galactic interlingua in _Citizen of the Galaxy_ (1957). There was his use of Loglan as the language used to communicate with Mycroft, the supercomputer in _The Moon Is a Harsh Mistress_ (1965). And he may have mentioned Basic English in his written-to-order novella _Gulf_ (1948).

Again, these are simply mentions, without any meat to them. Esperanto is not mentioned elsewhere in his other "future history" stories; Basic English is not mentioned in _Friday_, which was written against the same background as _Gulf_; and I do not remember Loglan being mentioned in _The Cat Who Walks Through Walls_, a sequel to _The Moon Is a Harsh Mistress_ -- or, in fact, in any of Heinlein's other books. Hence I'm a little surprised at lojbab's reference to "Heinlein's books" which mention Loglan. Nor, in any of these works, does Heinlein ever suggest that he knows anything at all (with the possible exception of Esperanto's phoneticity) about any of these conlangs.
With respect to Loglan: I suspect that Heinlein saw the original SA article and skimmed it, and then, some three to four years later, remembered the name ("A Logical Language" -- Loglan), and decided that a language with such a name would be appropriate for communicating with a computer in a science-fiction story. This may have been the first mention of Loglan with respect to computers; I don't remember that JCB addressed the question in the SA article (in 1960, computers were arcane machines used only by physical scientists to crunch numbers). If so, it may be that the preoccupation of many Loglanists with the question of language and computers stems just from that one use of the language's name by Heinlein in 1965; after all, _The Moon Is a Harsh Mistress_ was far more widely read than JCB's article, or anything else that came after, and so it may have been felt necessary to associate Loglan with computers.

As to how much influence Heinlein will continue to have on informing the public about Loglan/Lojban: while _The Moon Is A Harsh Mistress_ is in my opinion one of the best (perhaps _the_ best) of Heinlein's post-1959 works, this is not saying much; and, indeed, it is getting harder and harder to find these later books on your bookstore shelves, unless you have access to a specialty store. On the other hand, the "Future History" stories and the Scribners' juveniles remain perennially in print -- so people will continue to see mentions of Esperanto and Interlingua in Heinlein's books, where Loglan (and Basic English) will have largely disappeared.

=============================================
Don Harlow Redaktoro Esperanto U.S.A.
tel.
(1 510) 222 0187
CompuServe [72627,2647]
Internet 72627.2647@compuserve.com
=============================================

Date: Thu, 7 May 1992 16:30:07 +1000
From: Major
To: conlang@buphy.bu.edu
Subject: response to Rick Morneau
X-Mailer: GNU emacs 18.58

lojbab@grebyn.com (Logical Language Group) replies to Rick Morneau:

> Rick only mentions this time around that he feels that Loglan/Lojban
> is 'too complex' to be suitable for MT, that it takes too long to
> parse. He also says that coming up with MT interlingua parsers are
> easy, which seems contrary to fact, since there aren't a lot of
> them, nor even a lot of people choosing the interlingua approach.

1. There aren't all that many people doing MT, period.

2. The interlingua approach will in the end be more productive but is in the short term much harder, as it requires a more complete understanding of the source text. For this reason the IL approach is not used in any of the systems which are in production (as far as I know) but is being used in the more interesting research systems.

3. An MT interlingua is not necessarily a language in the normal sense; most of the IL-based MT systems have internal data structures to represent the text under translation which are nothing like human speakable languages.

    (move :agent (person :name "John")
          :obj   (object :type desk)
          :time  past)

    "John moved the desk"

Such structures take ZERO time to parse; they are already in the internal form that the program wants.

If a conlang wants to be used as an MT interlingua it will have to bring something that would justify the programmer going to the trouble of generating and parsing it. What it brings might be any combination of the following:

1. Understandability/Inspectability.
If the programmer is a very proficient speaker of the conlang, then the conlang utterance may be more recognisable to him/her than the dump of the internal data structure and constitute a better/additional check that the understanding of the source text has been done correctly.

2. A predefined set of words. Not having to go to the trouble of developing your own dictionary is a significant advantage, particularly for a research system. This advantage only works if there is a 1-1 correspondence between words and concepts. Almost every conlang claims this and I have yet to see one deliver. Note for Lojbab: Word includes lujvo, and "x modified in some way by y" is too vague for this purpose.

Another use to which MT people might put conlangs, particularly conlangs as rigorously defined as lojban, is as a checklist. Your MT interlingua probably needs to be able to say all or most of what the conlang can say; checking each conlang feature and making sure that you have an equivalent expression is a valuable exercise for an interlingua developer. Lojbab hints at this here:

> Taking tense as the more 'linguistic' of the two, an MT translatior between
> a perfective tense-based natlang and English, had better have a sufficiently
> complex model of the tense structures and their interactions, if it is to
> serve adequately. Lojban's tense grammar was designed by a tense logician
> who has studied the variety of tenses used in the world's languages.

Lojbab continues:

> I suspect that a human being speaking Lojban or another natlang-like
> language can much more effectively communicate ideas than someone
> trying to speak Lisp - because people don't think in Lisp.

Speak for yourself! I became productive in LISP within a week of my first contact with it. The same is not true of lojban, and not because I put less effort into lojban. The difference is that LISP (of itself) can only express a very limited range of ideas (instructions for a computer).
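[A minimal sketch, not part of the original message: the `(move :agent ...)` frame from earlier in this message, rendered as a Python data structure. The slot names `agent`, `obj` and `time` come from that example; the helper names `frame` and `render_english` and the one-entry tense table are invented here purely for illustration. The point it tries to show is that the program never parses anything: it builds the structure directly and walks it to generate output.]

```python
# Hypothetical Python rendering of the frame from Major's example.
# `frame` and `render_english` are illustrative names, not from any MT system.

def frame(head, **slots):
    """Build a frame as a plain dict: a head symbol plus named slots."""
    return {"head": head, "slots": slots}

# Equivalent of: (move :agent (person :name "John")
#                      :obj (object :type desk) :time past)
john_moved_desk = frame(
    "move",
    agent=frame("person", name="John"),
    obj=frame("object", type="desk"),
    time="past",
)

# Toy lexicon for one target language; a real generator needs far more.
PAST_TENSE = {"move": "moved"}

def render_english(f):
    """Generate English for the single sentence pattern used above."""
    slots = f["slots"]
    verb = PAST_TENSE[f["head"]] if slots["time"] == "past" else f["head"]
    subject = slots["agent"]["slots"]["name"]
    obj = "the " + slots["obj"]["slots"]["type"]
    return f"{subject} {verb} {obj}"

print(render_english(john_moved_desk))  # John moved the desk
```

[Only generation is sketched; the hard part the message describes, deciding what the frame should contain for a given source text, is not addressed.]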
LISP data structures on the other hand (such as those in my example above) can express anything that can be expressed in a computer; the difficulty is working out the semantics of what an expression means rather than of getting the expression into the machine.

> And it is far easier to look at a computer dump written in Lojban
> and find errors than one written in binary code,

Yes.

> or in some ideal super-simple language.

Maybe. Ultimately the data in the machine is ones and zeros, but only a complete masochist would try to examine them that way. The next level up is some LISP, PROLOG or whatever data abstraction; this is the level at which I as a programmer am used to working. If I was printing the data for a user rather than for myself I would translate it into some form that that user was likely to understand (say lojban or English).

Major

Date: Fri, 8 May 92 13:01:20 MDT
From: mnu@inel.gov (Rick Morneau)
To: conlang@buphy.bu.edu
Subject: More on loglans and machine translation
Cc: mnu@inel.gov

Howdy conlangers!

I'm sure that most of you are becoming as tired of this discussion as I am, so I'll try to keep this as short as possible. Bob raised several interesting issues, but I will limit my responses only to those which I feel are relevant to conlangs and their use as interlinguas (ILs) in machine translation (MT).

Bob, I did not respond to John Cowan's final remarks on the suitability of Lojban as an MT IL because, without exception, they either ignored the points I made, they indicated that he failed to understand the points I made, they completely side-stepped the issues, or they were totally irrelevant. After two attempts, I gave up in frustration.

As Major pointed out, there are no parsers for MT ILs, because the ILs themselves are kept in binary form. The whole point of my posts was to disprove the naive claim that loglans can fill this function. The points I made were based on knowledge of current MT technology and linguistics.
NONE of the responses were similarly based.

Your use of a logician as a credible expert in natural language semantics is totally unconvincing (What next? A theologian?). I've spent a lot of time studying Montague semantics and its offshoots (model theoretic and truth conditional semantics). When it comes to representing or analyzing all (or even most) of natural language, THEY SIMPLY DON'T WORK. At best, they are somewhat useful tools for representing and comparing the meaning of small subsets of language, and even this is debatable. Besides, I've already addressed the use of a so-called "logical" framework for dealing with natural language, and, like all of the other issues I raised, it was effectively ignored. (By the way, you state that the formalisms that you've "heard about" that are currently being used by formal semanticians are subsets of Lojban. Why don't you tell them about your work? It's very unprofessional of you to force them to re-invent the wheel. :-(

Your discussion of tense was not really relevant to MT ILs, since ALL natural languages handle tense adequately. In fact, all current MT applications deal with tense simply and directly. However, if you want to learn more about tense in natural language, then I suggest that you learn about it from LINGUISTS rather than from logicians. I suggest you read the books "Tense" and "Aspect", both by Bernard Comrie, and "Mood and Modality" by F. R. Palmer.

I also suggest that you use linguists (NOT logicians) as your sources for material on language universals. Greenberg, Comrie and Croft have written extensively on the subject, and I'd be happy to provide you with complete references. I intend no offense to your logician colleague, but, in designing a LANGUAGE, you'll do much better listening to linguists than to logicians. This is especially true if you want your language to be used as an interlingua in the automatic translation of NATURAL LANGUAGES.
I won't address the parsing-complexity issue, since Major and Lars have already done so. I would like to add, though, that an IL that is to be given a readable and speakable surface form can be quite simple - to the extent that a much simpler LL or ATN parser could be used. It would also more easily allow for integration of concurrent semantic processing - something that would be much harder to do with a YACC parser. (I admit to being somewhat biased against the use of a YACC parser. I spent almost three years doing compiler development for the programming language Ada. At the very beginning I had hoped to save time by using YACC, but found it to be useless for our particular needs.)

Bob, you wrote: "Only when Rick poses something that Lojban CANNOT do has he found a hole in our design." Unfortunately, this kind of reasoning is ubiquitous. It's also totally meaningless, and shows how little most people understand the MT process. After all, this test for robustness can be met by ANY natural language and, with appropriate lexical and/or syntactic extensions, by almost any conlang. In effect, you're saying that Lojban is as good (or as bad) an IL as English or Esperanto. That may be true, but so what? Robustness is only one of many requirements for an MT IL. HOW that robustness is implemented is MUCH more important, and, as I discussed in detail in my original critiques, it is in this area where the loglans fail miserably.

You say that "there are many approaches to MT". I'm aware of only three: the direct approach (a brute force approach that is no longer being used for new development), the transfer approach (useful only when dealing with a SINGLE language pair) and the interlingual approach. (By stretching things a bit, I suppose you could say that Eurotra uses a fourth approach, since it's about halfway between a transfer scheme and an interlingual scheme.
There is also some work being done using statistical methods, but this research is both extremely preliminary and somewhat controversial.) I would not use the word "many" to describe just two distinct and viable methodologies. Perhaps you had something else in mind.

You also wrote: "But he condemns without understanding, and his arguments thus far have been perfectly valid, only if we would be planning on going about MT in exactly the way he seems to want to, which we aren't." Can I assume that you are about to shake up the world of machine translation with a new, awesome approach based on Lojban as an interlingua? Please tell us about it. I hope you will publish soon.

Bob, you said that I don't know enough about Lojban to effectively criticize it. I don't know how much you consider "enough", but I have a stack of Lojban materials several inches thick which you sent me and which I read thoroughly. It was some time ago, and I certainly don't remember everything I read. But I learned enough about it to decide that it was unsuitable for use as an MT IL. If you insist that a critic must speak the language fluently before he can talk about it, then you'll be able to evade ANY form of constructive criticism - by definition.

Your arguments about language complexity, redundancy, suitability for use by humans, et al, were all very interesting but too vague to latch onto. And even if I tried to focus your comments onto real MT problems, you'd only ignore it and respond with even more well-written and highly convincing vagueness. Besides, almost everything you said could be said just as easily about other conlangs with only minor changes in wording.

Finally, Bob, in claiming that Lojban is suitable for use as an MT IL, you are making a major claim as well as a major COMMITMENT - and I don't think you realize just how serious this commitment is.
Before you continue making claims in a field that is almost completely outside of your areas of knowledge and expertise, I suggest you learn a lot more about machine translation and its requirements. By making unjustifiable claims in a highly technical arena, you're simply setting yourself up for a nasty fall.

Best regards,

Rick
--
*=*=  Disclaimer: The INEL does not speak for me and vice versa  =*=*
=   Rick Morneau          Idaho National Engineering Laboratory     =
*   mnu@inel.gov          Idaho Falls, Idaho 83415, USA             *
=*=*=*=*=*=*=*=*=*=*=  NeXT Mail accepted here!  *=*=*=*=*=*=*=*=*=*=

Date: 11 May 92 21:42:40 EDT
From: Don Harlow <72627.2647@CompuServe.COM>
To: Conlang
Subject: Ia-zu and Neoispano

To: Conlang >INTERNET:conlang@buphy.bu.edu
Dato: 920511

For those who are interested in non-European, non-a-priori conlangs, the following letter came to me today. The following is translated from the original Esperanto.

"Esteemed editorial staff!

"A fellow Esperantist greets you!

"Before everything I want to place an announcement in your newspaper or magazine. Whoever wants to learn a new language Ia-zu of Asian peace, let him send me a letter, please. We will send a fast textbook of Ia-zu for one week. After learning we will send a diploma from a Ia-zu teacher. Cost for learning is 20 US $, to be sent to our central bank Ulan Bator No. 334521. Receipt by letter.

"My address: Mongolia, Ulan Bator, P.O. 36, box 81, Tsedendambyin Bold."

If anyone is interested, I recommend that they directly contact Mr. Bold (I don't know whether he speaks any language besides Mongolian and Esperanto) to find some alternate way of sending money -- I don't know just how one would go about paying through a bank in Ulan Bator, nor how much extra one would have to send for the bank's cut of the money.

---

The February, 1992, issue of the Esperanto literary magazine "Literatura Foiro" arrived today.
It contains a four-page article by Bernard Golden about a constructed language named "Neoispano", which is apparently a sort of Basic Spanish invented by one Gregorio Martines d'Antonana. The first textbook appeared in 1973. Sample text given in the article (or part of it, at any rate ...):

"O lengua u idioma es o medio d'expresar nuestro ideas oralmente u por escrito. Pa faser un idioma fektibo i pr'aktiko, tiene k ser uniforme i klaro, sujeto a reglas. O konjunto d'esto reglas es lo k konstituye o gram'atika. ...

"O letras se dibide en bokal i konsonante. O bokales es a, e, i, o, u. O resto es konsonante. Tuto o letras puede ser may'uskulo u min'uskulo. Se usa letra may'uskulo ao inisiar un frase; despu'es d punto final; en nombre proprio d personas i lokalit'es; en t'itulos d personas, libros, enkabesamiento d'art'ikulos d kualkier t'opiko, etc.s se denotmina: art'ikulo, nombre, berbo, adjetibo, pronombre, adberbio, preposisi'on, interjexi'on, seg'un su ras'on d ser."

(All apostrophes, except the first in "d'art'ikulos," represent acutes over the following vowels.)

If anyone is interested, Golden's article "has as its specific purpose the completion of the content of a work dedicated to analysis of planned languages based on Castilian." The book in question is:

Ari, Valerio, "Il castigliano come base e mezzo di pianificazione interlinguistica," Bellinzona (Switzerland): Hans Dubois, 1983.

=============================================
Don Harlow Redaktoro Esperanto U.S.A.
tel. (1 510) 222 0187
CompuServe [72627,2647]
Internet 72627.2647@compuserve.com
=============================================

Date: Thu, 14 May 92 17:59:03 +0200
From: maxwell@ltb.bso.nl (Dan Maxwell)
To: conlang@buphy.bu.edu

As a linguist who worked for a few years in an MT project, I suppose I should make some attempt to mediate in the dispute between Rick and Lojban about the suitability of Lojban and other languages for use as an interlingua in MT.
Rick mentions work on universals by Greenberg, Comrie and Croft and notes that Montague grammar doesn't work in MT. Fair enough, it and formal semantics in general are still in their infancy, so it probably is a mistake to spend too much time building it into your MT system at this point. The Rosetta Project in Eindhoven (Netherlands) did just this, as I understand it, and my impression is that they put so much effort into creating this skeleton that there wasn't much time left over to add the flesh and blood of linguistic facts. But this is only an impression. The results of work on MT are notoriously hard to evaluate, so hard that people are willing to devote an entire conference to attempting to develop methods for doing this.

Rick makes the point that linguists should be consulted when searching for linguistic universals. True, but I should think that logicians have a contribution to make too. After all, Reichenbach was a logician, and yet his work has been the starting point for most formal work on tense by linguists - mistakenly so, perhaps Rick and others would claim, but this kind of work has produced some interesting results, and should with time get better.

For a long time there has been a split - and sometimes unfriendly feelings - between linguists applying formal methods and working intensively on usually just one language (starting with Chomsky in syntax and Montague in semantics) and those looking less intensively at a lot of different languages or trying to understand the historical development and functional viability of specific languages. This tradition is usually traced back to an article by Greenberg, first published in 1963. Comrie is now one of the most well-known practitioners of this approach (I'm not familiar with Croft's work).
Formal methods are most directly useful to computational applications, since the linguistic rules (and the relationships between them) have to be made mathematically precise and most applications don't deal with more than one or two languages. But in the long run, we need cross-linguistic work as well. I feel myself that the formalists have sometimes been more influential than they deserve to be on the basis of their results so far, but this is understandable given the relationship just noted between formal work and computer applications and the relatively large amounts of money available for computer applications of anything. Linguists not using formal methods have to search for funding in less affluent circles (eg, grants for studying cultural minorities, etc).

Now I think these different traditions are starting to converge, although there is still a long way to go. But there is a lot more formal work on a wider variety of languages than there used to be, and people like Comrie are interested in the applicability of formal linguistic models to the languages they look at.

Rick says that not more than 60 production rules are necessary for any language. Maybe this depends on what counts as a production rule. The standard book on "Generalized Phrase Structure Grammar" (published 1985) has slightly more than 60 such rules (in this case, "Immediate Dominance Rules" and "Metarules") and is still a long way from providing a complete grammar of English. Volume 2 of work on "Head-Driven Phrase Structure Grammar" (due out this year, I believe) presents about 6 "Rule Schemata", each of which interacts with the lexicon to create a large, but unspecified number of production rules. The grammar there covers more English than the book on GPSG, but definitely not everything. I'm less familiar with Government-Binding.
Computational linguists generally agree that it is not always clear how certain parts of this system work, and some linguists of the many who use GB admit that it is less well suited for computational applications. There was a claim some years ago that its practitioners no longer needed more than a few phrase-structure rules (the most well-known type of production rules), but there doesn't seem to be general agreement on what has taken their place.

Anyway, I don't think these frameworks are as relevant as they are sometimes thought to be for MT, since MT needs generation only in a restricted sense. The source language is given: the system has to analyze it rather than generate it. To get from the source language to the target language (or intermediate language, if there is one), DLT uses rules which transform parts of the source language representation into appropriate target language representations. The rules used for this in DLT were rather similar to transformations used in the so-called "Aspects" model of transformational grammar.

We got a degree of "understanding" (the hardest part of MT), though only at the sentence level, by first trying to anticipate all the ways any structure in the source language could be translated into the target language, and having rules available for all of them. The presence of ambiguity in the two-way dictionary (many words in one language will have more than one translation in the other) adds to the number of possible translations. We then have a knowledge bank in Esperanto which provides implicit information about the relationships between the words in the various possible translations. We use this information as the basis of an algorithm to choose between the possible translations developed by the transformations.

Does the grammar of the IL need to be simple? It has to be complex enough to be able to represent the meaning of the input language.
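[A minimal sketch, not part of the original message: the selection step described above (enumerate every candidate translation the dictionary allows, then use a knowledge bank of word relationships to choose among them), reduced to lexical choice only. Everything here is an assumption made for illustration: the three-word dictionary, the knowledge bank stored as unordered pairs of related Esperanto words, and the pair-counting score. The message does not say how DLT's knowledge bank or algorithm actually worked.]

```python
from itertools import product

# Hypothetical two-way dictionary: an ambiguous source word has several
# candidate translations ("bank" as institution vs. as river edge).
DICTIONARY = {
    "deposit": ["deponi"],
    "money": ["mono"],
    "bank": ["banko", "bordo"],
}

# Hypothetical knowledge bank: unordered pairs of words known to be related.
KNOWLEDGE_BANK = {
    frozenset(("banko", "mono")),
    frozenset(("banko", "deponi")),
    frozenset(("bordo", "rivero")),
}

def candidates(words):
    """Enumerate every combination of dictionary translations."""
    return [list(combo) for combo in product(*(DICTIONARY[w] for w in words))]

def score(candidate):
    """Count the word pairs in a candidate that the knowledge bank relates."""
    return sum(
        frozenset((a, b)) in KNOWLEDGE_BANK
        for i, a in enumerate(candidate)
        for b in candidate[i + 1:]
    )

def choose(words):
    """Pick the candidate whose words are most strongly related."""
    return max(candidates(words), key=score)

print(choose(["deposit", "money", "bank"]))  # ['deponi', 'mono', 'banko']
```

[Word order and grammar are ignored; the sketch only shows how dictionary ambiguity multiplies candidates and how relatedness between the chosen words can break the tie.]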
In the approach outlined above, the closer its grammar is to the grammar of the languages which it has to be translated to and from, the fewer "transformation" rules need to be written. How close it can be to these various languages depends partly on how close they are to each other. But writing these rules is not the hard part of the translation process in DLT, so for us this was not a major issue.

If Lojban has a lot of words which correspond to something in logical representations, but not in natural languages, then this would make the transformation rules somewhat more complicated, but probably not intolerably so. If such words can make the choice between alternative translations created by the transformation rules easier to make, this would make them worthwhile. But I don't see that this is the case. Ambiguity remains in the source language or in its relationship to the IL, no matter how explicit the IL is.

Dan Maxwell

Date: Thu, 14 May 92 15:24:13 CDT
From: dean@anubis.network.com (Dean C. Gahlon)
To: conlang@buphy.bu.edu
Subject: Volapuk affixes

Here at long last is the list of Volapu"k affixes. This list is taken from the one in the _Dictionary of Volapu:k_, by M. W. Wood, copyright 1889. (There was another page of numerical derivative affixes that is not included in this list; if there's demand, I may type that in at a later time.)
------cut here-----

Prefixes

a-  sign of the present tense, omitted in active voice, indicative mood; sign of the present time
a"-  sign of the imperfect tense, and of past time, more particularly of the recent past
ai-  sign of the present aorist, or frequentative, denoting continuous, habitual, or enduring action
a"i-  sign of the imperfect aorist
ba-  contraction of "bal"--one
ba"-  lower; nether; down
ba"le-  old; of old
be-  intensive prefix, very much as it is often used in English; converts an intransitive verb into a transitive one
begi-  beginning; opening; initial
bei-  by; past
beno-  well; good; agreeable
bevo-  between; inter-
bevu"-  between; inter-
bi-  fore; for-; pre-
bise-  in advance; ahead
bive-  forwards
bizu-  prior
bla-  vice-
bla"-  black
bu-  pre-; before; preceding
bu"-  pre-; before; prior
bu"nu"-  in; within (locally)
bu"o-  former; prior
da-  completion of an action; attainment of a purpose; sometimes a strengthening prefix, sometimes indeterminate
da"-  completion of an action
de-  from; out of; away
denu-  re-; anew; again
des-  here; this, etc.
dese-  out; hence; from here
di-  destructive action; change produced by the action
dil-  part; partial
dilo-  part; partial
dis-  lower; nether; below; under
disa-  lower; nether; below; under
disi-  lower; nether; below; under
diso-  lower; nether; below; under
do-  dark (color)
do"-  per
du-  through
du"-  hard
e-  sign of the perfect tense, of completed past time
ei-  sign of the perfect aorist
fe-  away; aside; off
fen-  oft- (denoting frequent repetition)
fi-  fixed; arrested
fla"-  free; prepaid
flu-  flowing; liquid
fo-  before; previously; anticipation
fo"-  ???
fo"a-  front; in front; fore
ge-  back; back again
gi-  aright
gle-  great; head; arch; principal, etc.
gleo-  great; head; arch; principal, etc.
glu-  lesser; lower; worse
i-  sign of the pluperfect tense, and of time completed before some past time
ii-  sign of the pluperfect aorist
ita-  self
iti-  self
ito-  self
ji-  general feminine prefix
jo"-  beautiful; fine, etc.
ke-  con-; with; together with; (sometimes exchanged with ko-)
ki-  interrogative prefix
kli-  field-; war- (milit.)
ko-  co-; con-; with; together with; (sometimes exchanged with ke-)
kos-  toward (to meet)
ku"-  cubic
la"-  on; at; to; with
lafa-  half; step-; semi-
lai-  continual; standing, etc.
le-  general strengthening prefix; more; greater; larger; higher, etc.
lei-  equal; like; etc.
len-  at; on; to
li-  general simple interrogative
lio-  general simple interrogative
lo"-  above; upper; over
lo"pa-  above; upper; over
lo"pi-  above; upper; over
love-  over; over to
lu-  depreciating diminutive; lesser; worse; smaller; fewer, etc.
lu"-  to; toward
lu--ik  -ish
luk-  hither
mai-  open
mi-  mis-
mo-  away; out
mo"-  poly-; much; many
mo"--ub  profusion
mo"da-  much; many; manifold; poly-
mo"di-  much; many; manifold; poly-
mo"do-  much; many; manifold; poly-
mo"du"-  most-
nal-  like; after, etc.
ne-  general negative prefix
neba-  branch; side; accessory
nei-  net; neat; pure
ni-  near
nil-  near
nin-  in; into
no-  noble
nu-  (geographical prefix) new
nu-  now; present
nu"-  into; in
o-!  sign of the vocative
o-  sign of the first future tense and of future time
oi-  sign of the future aorist
om-  general masculine prefix
o"m-  emasculated
p-  general sign of the passive
pa-  sign of the present passive
pa"-  sign of the imperfect or past passive
pai-  sign of the present aorist passive
pal-  double
pe-  sign of the perfect passive
pei-  sign of the perfect aorist passive
pi-  sign of the pluperfect passive
pii-  sign of the pluperfect aorist passive
plu-  more (intensive)
po-  sign of the first future passive
poe-  pseudo
poi-  sign of the first future aorist passive
po"--o"l  the gerundive
pos-  past-; after; subsequent
pu-  sign of the second future passive
pui-  sign of the second future aorist passive
sa"-  dis-; out; from; away, etc.
se-  from; out; away
sem-  some; any
sene-  outward; exterior
seo-  out
si-  astronomical prefix; names of constellations
sil-  astronomical prefix; names of constellations
sma-  diminutive; young; small; little; dwarf
smal-  diminutive; young; small; little; dwarf
sne-  undulatory; serpentine; here and there
so-  such; so
soa-  isolated; solitary
sti-  honorary; honorable
stima-  honorary; honorable
su-  on; at; upon
su"-  forth; out
sui-  up; at; out
sun-  early; soon
sus-  upward; hitherward
susi-  up; upward
suso-  upwards; thence upwards
susu"-  upward; thitherward
ta-  contra-; counter
tu-  too; excessively; all too
u-  sign of the second future, and of time subsequent to some future time
ui-  sign of the second future aorist
u"l-  primeval; primitive; antique; original; first
va-  square
va"-  every; all, etc.
ve-  along
vi-  woe
vie-  white
vifa-  swift; express; fast
vina-  wine-
vio-  how; however (relative)
vo-  other
vo"-  former; old
vo"no-  former; old
vota-  about; changed
voti-  about; changed
voto-  about; changed
vu"-  between; inter-
xo-  out; ended; used up
yu"-  blue
zi-  circum-; about
zu"-  all about; around

Suffixes

-a  sign of the genitive singular
-a  as an adverbial termination, indicates direction from
-ab  names of kinds of money
-ab  some concrete neuter nouns
-a"b  personal noun
-ad  indeterminate concrete noun
-a"d  concrete neuter noun (-ad and -a"d are frequently interchanged)
-af  zoological termination
-a"f  botanical termination; names of flowers
-ag  indeterminate noun
-al  personage; official; superior; distinguished
-a"l  denotes peculiar mental characteristic, and rarely a person distinguished by such characteristics
-am  denotes participial noun, an action
-a"m  denotes an office, a position, and sometimes a second participial noun
-an  person; one who is--; native or inhabitant (-el was formerly used for -an in many cases)
-a"n  geographical termination; names of subordinate countries, provinces, states, etc.
-ap  anatomical termination
-as  sign of the genitive plural
-at  generally indeterminate concrete noun
-a"t  -ate; -cy; -ity
-av  names of sciences
-a"v  names of arts
-bik  sign of the adjective
-del  names of days of the week
-dil  part; -th
-e  sign of the dative singular
-eb  indeterminate
-ed  indeterminate
-ef  collective noun; personal
-eg  indeterminate
-ek  indeterminate
-el  personal; one who does or makes something (formerly used in place of -an)
-em  collective noun; neuter; a collection of --
-en  business; occupation; behavior
-ep  botanical termination; names of plants
-es  sign of the dative plural
-et  names of things, especially of foods; sometimes indeterminate
-ev  termination for technical poetical terms
-fik  sign of the adjective
-gik  sign of the adjective
-i  sign of the accusative singular
-ip  denotes a concrete neuter noun
-id  sign of the numerical adjective (ordinal) -th
-ido  sign of the numeral adverb -thly
-iel  apparatus; machine; contrivance; thing which does something
-ik  sign of the adjective
-ikos  something; that which is --; the --
-il  denotes the diminutive
-im  generally; -ism; but sometimes indeterminate
-in  generally used for the chemical elements, and other elementary substances; but also sometimes used indeterminately
-ion  -illion
-ip  denotes names of diseases and diseased state
-is  sign of the accusative plural
-it  generally denotes names of birds; but sometimes used indeterminately
-iv  denotes the essentials of
-l  termination of the simple numerals
-lo"f  denotes an abstract quality
-na  numeral adverb; -- times
-nik  sign of the adjective
-o  sign of the derivative adverb
-o"  sign of the derivative interjection
-ob  sign of the first person singular
-o"b  indeterminate
-obs  sign of the first person plural
-od  indeterminate
-o"d  sign of the imperative
-of  sign of the third person singular feminine
-o"f  denotes an abstract quality or condition; -ness; -ity
-ofs  sign of the third person plural feminine
-og  indeterminate
-o"g  -bone
-oin  mineralogical termination
-ok  sign of reflexive verbs
-o"k  denotes a need of, a necessity for
-ol  sign of the second person singular
-o"l  sign of the participle
-om  sign of the third person non-feminine
-o"m  collective noun; furniture; utensils, etc.
-oms  sign of the third person plural
-on  sign of the indefinite subject; one
-o"n  sign of the infinitive
*-ons  sign of the second person; "form of courtesy"
-op  termination of names of continents
-o"p  denotes name of a locality
-os  sign of the impersonal verb (see also -ikos)
-ot  concrete thing made or done
-o"v  sign of the conditional
-o"x  sign of a form of potential with "might"
-o"z  sign of the jussive, an unconditional imperative
-s  sign of the plural
-sik  sign of the adjective
-su"k  denotes a passion for; -mania
-t  sign of the demonstrative pronoun
-ta"t  denotes a condition or quality; -ity
-tet  denotes an abstract quality
-tim  names of seasons, and some other nouns of time
-u  sometimes used with a simple root for the formation of a second derivative preposition
-u"  sign of the derivative preposition
-ub  indeterminate
-u"b  indeterminate
-u"d  termination of the names of the cardinal points of the compass; also used indeterminately
-u"del  days of the week
-uf  used to denote a disposition, an abstract condition
-u"f  denotes a musical term
-ug  denotes a characteristic property
-u"g  -ship; -hood; and also used indeterminately
-uk  termination of the names of fruits
-u"k  generally to denote a prerogative
-u"ko"n  forms verbs from adjectives; to make --
-ul  termination of names of months; also used indeterminately
-um  sign of the comparative
-u"m  denotes a musical composition
-umo  sign of the comparative of adverbs
-un  product, of manufacture, etc.
-u"n  sign of the superlative
-u"no  sign of the superlative of adverbs
-up  termination for names of plants which are not utilized (-ep for those made use of)
-u"p  termination for names of periods of time
-ut  termination for names of machines
-u"t  termination for names of fishes, particularly for those preserved for eating
-ved  direction toward; -wards

----- cut here -----

Dean Gahlon
dean@ns.network.com

Date: Mon, 18 May 92 20:02:45 -0400
From: brgilson@highlite.gotham.COM (Bruce R. Gilson)
To: conlang@buphy.bu.edu
Subject: INTAL 1978

I've distributed the 1970 version of the INTAL standard grammar to a number of you in the past. It seems that, unknown to me, Weferling was gradually modifying the language; I received in the mail today a copy of the 1978 edition. This is stated to be the 23rd edition, and since the 1970 edition was the 2nd, Weferling must have been very concerned with perfecting INTAL.

Actually, INTAL 1978 and INTAL 1970 do not appear to be different in a really significant manner. The noun endings -o (common), -u (masculine), -a (feminine) have been replaced by -e, -o, -a (which agree with Novial and my own preference), and the final -e on nouns is more consistently used. The passive of the verb is formed with fi instead of bli. Verbal substantives are in -a (infinitive or noun to designate the action), -e (concrete notion) so "le planta" = "the planting" but "le plante" = "the plant." I don't have the 1970 version in front of me as I write this, but except for those forms, that seems to summarize the differences. I would say that they made for an improvement.

I was sent this pamphlet by Tom Wood, the developer of Interling. In correspondence with me, he indicates that he considers himself to be working in the spirit of Weferling; however, I feel in some ways Weferling's 1978 Intal came closer to my ideal than 1992 Interling. However, I intend to continue contact with Wood, and perhaps I can influence him to revert in those matters.
In any case, I would like to advocate 1978 Intal as opposed to the 1970 version which I have up to now been distributing.

Date: 19 May 92 09:10:00 EDT
From: "61510::GSIP"
Subject: More complete comparison between 1970 and 1978 INTAL
To: "conlang"

Comparison between the Second (1970) and Twenty-Third (1978) Editions
========== ======= === ======        === ============        ========

The Second Edition is entitled: "Standard-Gramatik del International Auksiliari Linguo," and the Twenty-Third is entitled: "Standard Gramatike del International Planlingue INTAL." For purposes of this document, they will be referred to as INTAL-1970 and INTAL-1978 respectively.

Alphabet: In INTAL-1970 it states "without supersigns" but in fact an acute accent is optionally used when required to show that the accent does not follow the standard rule; in INTAL-1978 this is made explicit on p. 4 in the description. (In INTAL-1970, in fact, it is shown under Orthography; in INTAL-1978 in both places.)

Orthography: Identical except that INTAL-1970 permits a period to show a short vowel as well as a colon to show a long; INTAL-1978 omits the period.

Endings: Nouns in general end in INTAL-1970 in the Esperanto/Ido -o, with a masculine form in -u and a feminine in -a for words denoting living beings. In INTAL-1978 this has been changed to Novial's -e (general), -o (masculine), -a (feminine). This causes some other changes, for example the -eso ending for qualities becomes -ese. In both versions, the noun endings (-o/-u/-a or -e/-o/-a) and the adjectival ending -i are omissible "if the sense is clear."

For verbs, the infinitive in INTAL-1970 has an -ar ending, and concrete/abstract notions in -o. In INTAL-1978 the infinitive ends in -a (like the present tense) and the concrete notion (e. g. "plant" from the verb "to plant") in -e.

A new ending -im for deriving adverbs from adjectives appears in INTAL-1978. The -men (manner) ending of INTAL-1970 is dropped, but in most cases this would seem to replace it.
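
As a rough illustration of the noun-ending change just described, the following Python sketch rewrites an INTAL-1970 noun with its INTAL-1978 ending. The function name is made up, and the plural handling (a final -s, inferred from forms such as "fratos"/"frates" in the Ode to Joy texts posted in this thread) is an assumption rather than something either grammar is quoted as saying here. It only applies to nouns; other parts of speech would need their own rules.

```python
# Ending map taken from the comparison: 1970 -> 1978.
# general -o -> -e, masculine -u -> -o, feminine -a stays -a.
NOUN_ENDINGS = {"o": "e", "u": "o", "a": "a"}


def noun_1970_to_1978(noun: str) -> str:
    """Rewrite the final class vowel of an INTAL-1970 noun to its 1978 form."""
    plural = noun.endswith("s")          # assumption: plural is marked by -s
    stem = noun[:-1] if plural else noun
    if not stem or stem[-1] not in NOUN_ENDINGS:
        # Ending omitted ("if the sense is clear") or not a noun vowel:
        # leave the word untouched.
        return noun
    converted = stem[:-1] + NOUN_ENDINGS[stem[-1]]
    return converted + ("s" if plural else "")


for word in ("sintilo", "fratos", "sorsios", "filia"):
    print(word, "->", noun_1970_to_1978(word))
```

Because the map is applied to the last vowel only, a derived ending such as -eso comes out as -ese without any special case, which matches the change noted above.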
Numbers: "Des" in INTAL-1970 becomes "dek" in INTAL-1978. The ordinal ending is changed from -esm(i) to -ti, the fractional ending from -im to -ime, and the group ("by twos") ending from -ope to -opim (in accordance with the -im adverbial ending).

Pronouns: By and large unchanged. "Mi" has become "me" and the possessive forms are more regularly formed, all except the third person by adding -i to the nominative/accusative. "Su" has become "su(i)" and the forms il-su, el-su, etc. have become ilsui, elsui, etc. (Several other forms have also lost their hyphens, but that is another story.) The abstract pronoun "lo" has been dropped.

Verbs: The active forms have not changed, except that the imperative, which in INTAL-1970 could be either identical to the present or have an -u ending replacing the present -a, now always has -u. The passive of stasis is formed as before with the auxiliary "es," but the passive of becoming, in INTAL-1970 formed with "bli," is in INTAL-1978 formed with "fi." In INTAL-1978, "vas" is suggested as a short form alternative to "did es."

Stress: The main rule in INTAL-1970 is to accent the penult in words ending in a vowel, and the final syllable in words ending in a consonant. In INTAL-1978 it is to accent "the vowel before the last consonant" (the Novial rule), which actually gives the same result, more concisely. Certain endings are ignored in determining the accent; the list is slightly different in the two versions.

Language name endings: INTAL-1970 in -al (Anglal, Espanal, Fransal, etc.); INTAL-1978 in -um (anglum, espanum, fransum, etc.); note also that in 1970 they are capitalized as in English, while in 1978 they are not.

Questions: In INTAL-1970, questions are formed either by inversion ("Konosa tu le vir?") or with Novial "ob." In INTAL-1978, the particle "ka" is used ("Ka tu konosa ti viro?"), as in Ido. (Japanese has the same particle, but there it comes at the end; in INTAL, as in Ido, it comes at the beginning.) Inversion has been dropped as an option.
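
The INTAL-1978 stress rule quoted above ("the vowel before the last consonant") is simple enough to sketch in Python. The sketch makes several assumptions that are not in the grammar as summarized here: it treats "y" as a consonant, it does not implement the list of endings that are ignored when placing the accent (that list is not given in this message), and the function names and the fallback for words with no vowel before the last consonant are invented for illustration.

```python
VOWELS = set("aeiou")  # assumption: "y" is treated as a consonant


def stressed_vowel_index(word: str) -> int:
    """Index of the vowel that precedes the last consonant of the word."""
    w = word.lower()
    consonants = [i for i, ch in enumerate(w) if ch.isalpha() and ch not in VOWELS]
    if consonants:
        # Walk leftwards from the last consonant to the nearest vowel.
        for i in range(consonants[-1] - 1, -1, -1):
            if w[i] in VOWELS:
                return i
    # Fallback (assumption): no vowel before the last consonant,
    # so take the first vowel of the word.
    for i, ch in enumerate(w):
        if ch in VOWELS:
            return i
    raise ValueError(f"no vowel in {word!r}")


def mark_stress(word: str) -> str:
    """Return the word with its stressed vowel written in upper case."""
    i = stressed_vowel_index(word)
    return word[:i] + word[i].upper() + word[i + 1:]


for word in ("planta", "konosa", "kultivat", "mond"):
    print(mark_stress(word))
```

For these four words the result agrees with the INTAL-1970 wording as well: penultimate stress in "planta" and "konosa", final stress in "kultivat" and "mond".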
Brief forms: In INTAL-1970 the auxiliary verb "es" can be omitted; this option no longer is stated in INTAL-1978.

There are a number of vocabulary changes ("reglari" replacing "regulari," for example), but the vocabulary was never fully fixed even in 1970, with both "skriba" and "skripta" appearing in INTAL-1970. Therefore, this cannot be considered a significant change.

Date: 19 May 92 09:37:00 EDT
From: "61510::GSIP"
Subject: Comparative text in 1970 and 1978 INTAL
To: "conlang"

Weferling has several poems and other passages in the two editions. However, I have found only one text that was in both. It is the Schiller "Ode to Joy," which Beethoven incorporated in his 9th Symphony. The 1978 version has a few extra lines, but I am sending them in their entirety. You will see that the 8 years brought some changes, but not many.

1970:

Al Joyo

Joyo, bel sintilo dei,
Filia ek Elysium,
Nos vol entra fairo-ebri
Tui santuorium.
Tui sorsios liga ri,
Ko par modo separat;
Omni homos eska fratos
Ku tu, joy, bli kultivat.

1978:

Al joye

Joye, bel deal sintile,
Filia ek Elizium,
Nos vol entra fairo-ebri,
Tui santuorium.
Tui sorsies liga ri,
Ko da mode separat;
Omni homs devena frates,
Ku tu joy' fi kultivat.

Esu embrasat miliones!
E ti kis al toti mond'!
Frates, super steles-rond'
Es un Deo in eones.

Bruce

(Please send responses to brgilson@highlite.gotham.com; this address is not my own account.)

Date: Fri, 22 May 92 08:46:11 -0400
From: ross@buphy.bu.edu (John B Ross)
To: conlang@buphy.bu.edu
Subject: Taneraic translation as promised

Here's the translation of the above "North Wind and Sun" story from Javant Biarujia, complete with pronunciation guide and gloss. Enjoy!

-- JR

PS Taneraic was created by Javant for his own pleasure and certainly not as a candidate for auxiliary world language or for MT applications. He believes in creating languages as an art-form and, accordingly, Taneraic grammar is quite irregular and idiosyncratic. Vive la difference!
:-> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Tanerai ['p' & 't' slightly aspirated. 't' & 'd' generally dental. 'b' & 'g' somewhat plosive. 'c' is an affricate (sounds like 'ts' in English 'mats'). 'cy' is an affricate (sounds like a lean English 'ch'). 'sy' is an affricate (sounds like a lean English 'sh'). 'j' as in French 'jour'. 'q' not aspirated and may have a little voice; at the end of a word is a glottal stop. 'x' back articulation sounds like 'ch' in Scottish 'loch'. 'l' is never dark. 'r' is a lingual roll. 'n' is syllabic; finally it is pronounced softly as in French 'son'. 'a','e','i','o' (sounds like 'o' in French 'porte') & 'u' are about cardinal. 'ai','au','oi' & 'ou' (sounds like 'o' in English 'hope') are diphthongs. 'ei' diphthong sounds like 'ai' in French 'main'. 'eu' diphthong somewhat like French 'oeuf'.] "Breqa Paqan e Ledub pu neuhabuva qemanatta nuyole leisoubda cer -- ayo nun i ayo nunien, go nuri baniaris nu uma alinmoutatta nas dibo das na bemieu vejerda. Yos yosabanda qemani peta yosexasatta puno oher mas yole baniaris uma amaxiraratti nas pepeta uza sancyagga e leisoubda cer veqis. Evon Breqa Paqan cigosda e peleisoubati, dus tari yocigosda tatari baniaris uma aiqolatti nas tuhu'; peibosuti otai elenisti Breqa Paqan lingandi bouainat. Evon Ledub sejir yarqada desgeji busai baniaris uma amaxiraratti nas dib uzeus lesega. Ji Breqa Paqan mas eyonustadi esmounegiatten Leduba." 
Sentence One:

'Wind North' (two substantives in apposition = 'The North Wind' -- the formal genitive, which merely describes but does not define, gives the form 'Breqa Paqana' without any "improvement" on expression; on the other hand, 'Paqani Breqa' translates as 'North[ern] Wind' -- such forms of apposition are commonly used in Taneraic: 'cp noussan bihar','[blue] jeans'; 'bihari noussan', 'blue trousers') (capital letters are the convention for turning common nouns into proper nouns)
'and' (particle that links terms belonging to the same family)
'Sun' (='the Sun'; there are no articles in Taneraic)
'once' (past imperfective auxiliary = 'once';'one-time';'used-to' -- here translates as 'were'; verbs are not declined -- instead, they are modified by auxiliaries to indicate tense and mood)
'together-discussion-reason-be' (compound intransitive verb with affixation = 'disputing';'arguing';'contending' -- the two unbound radicals are 'habuva' and 'qeman'; 'neu'- is a free morpheme which requires the suffix -'at' for ligature, and -'ta' indicates a grammatical intransitive inflexion)
'which[-one]' (a weaker form of 'nu', 'who' which follows or precedes intransitive verbs)
'strong-be more than' (='which was the stronger' -- adjectives are made attributive by adding the grammatical intransitive inflexion -'da'[-'ta' after 't']; comparison is expressed in this case by the use of a particle following the attributive adjective)
-- '[s]he-that' or '[s]he-this' (='the former' or 'the latter'; the third person does not differentiate gender or number -- this phrase has been inserted to obviate ambiguity; 'i' is a particle that links terms belonging to the same family),
'when'
'any' (='a certain')
'traveller' (radical is 'bana', travel; -'yaris' is a suffix which denotes a habitual agent)
'who' (pronoun which may introduce interrogation)
'self in-fold-do' (='was wrapped up' -- the radical meaning of 'uma' is 'direction' or 'toward', and is used to form the reflexive mood; the
radical 'inmout', 'fold', is prefixed with the free morpheme 'ai'- ['al'- before vowels] 'in')
'coat-heat-at' (='in an overcoat'; the compound radical is 'bemieu vejer', 'approach-pass'; 'das' is one of the three special function verbs, with the general meaning of 'have the quality of', used to help compounds become verbs; 'na' is a particle which indicates the immediate past).

Javant Biarujia
GPO Box 994-H
Melbourne 3001
Australia

Date: Fri, 22 May 92 22:00:47 -0400
From: brgilson@highlite.gotham.COM (Bruce R. Gilson)
To: conlang@buphy.bu.edu
Subject: Reflexive possessives

Rick Harrison wrote, and I'm sorry to take so long to respond:

>Lately there has been a discussion of reflexive possessive
>pronouns in sci.lang, with examples from several natlangs given,
>e.g. from Swedish:
>Han s"aljer sin bil. (He sells his [own] car.)
>Han s"aljer hans bil. (He sells his [someone else's] car.)
>Is it possible to make the same distinction in conlangs? I seem
>to recall Esperanto having a reflexive pronoun, but I'd like to
>know about the others: Volap"uk, Interglossa, Glosa, Ido et al.

In Intal, there is a clear distinction between these two cases. "Su" [su(i) in 1978 Intal] always refers back to the subject. The forms "il-su," "el-su," etc. in 1970 Intal, and "ilsui," "elsui," etc. in 1978 Intal, refer back to another person than the subject.

Bruce

From: "Edmund Grimley-Evans"
Date: Wed, 27 May 92 12:47:03 +0200
To: nsn@MULLIAN.EE.MU.OZ.AU
Cc: conlang@buphy.bu.edu, dfkihueg@rz.uni-sb.de
Subject: North Wind tale

> "The North Wind and the Sun were disputing(1) which was the stronger(2),
> when a traveller came along wrapped in a warm cloak(3). They agreed that
> the one who first succeeded in making(4) the traveller take his cloak(5)
> off should(6) be considered stronger than the other(7). Then the North
> Wind blew with all his might(8), but the more(9) he blew the more
> closely did the traveller fold his cloak(10) around him; and at
> last(11), the poor North Wind gave up the attempt(12). Then the Sun
> shone out warmly(13), and immediately(14) the traveller took off his
> cloak. And so the North Wind was obliged to confess(15) that the Sun was
> the stronger of the two(16)."

Norda Vento kaj Suno disputis, kiu estas la plej forta, kiam voja^ganto preteriris volvite en varma mantelo. Ili konsentis, ke, kiu unue sukcesos senigi la voja^ganton de lia mantelo, tiun oni konsideru pli forta ol la alia. Do Norda Vento blovis per ^ciuj siaj fortoj, sed ju pli li blovis, des pli la voja^ganto envolvis sin en sia mantelo; kaj fine kompatinda Norda Vento rezignis. Tiam Suno brilis varme, kaj tuj la voja^ganto deprenis sian mantelon. Tiel Norda Vento devi^gis konfesi, ke Suno estas la plej forta el la duopo.

----------

It took me about five minutes to do that translation. Because Esperanto is a "living language" with considerable history, I was able to translate not just the meaning but also the style.

Notes:

(1) I decided not to use articles for "Norda Vento" and "Suno". That strengthens the personification and adds to the parable/fable/fairy-tale atmosphere.

(2) "mantelo". There are at least half a dozen Esperanto words meaning "coat", "cloak", etc. If I had time I might consult an etymological dictionary to pick a suitably old word, but "mantelo" has roughly the same semantic field as "cloak", as it implies sleevelessness.

(3) "kiu unue ..., tiun ...". This word-order is not normal in conversation, but is very common in the Esperanto Bible, in proverbs, etc. Adds to the atmosphere, I hope.

(4) "devi^gis". I'm not very happy about this word, as such usage of the suffix "-i^g-" sounds too modern, but I don't have time to do anything better at the moment.
It's just occurred to me, that since this story is well-known and definitely not modern, it has almost certainly already been translated into Esperanto. Probably several times independently. In fact, the very fact that I recognise it, implies that it exists in Esperanto, because most of the books I read nowadays are in Esperanto. What a waste of time, me translating it again!

Looks like I've already wasted 11 minutes of my time that I could have spent working for the Unity of Mankind, so I'll stop now...

In the unlikely event that anyone wants to react to this, bear in mind that I don't actually read the "conlang" list.

=======================================================================
Edmund GRIMLEY EVANS dfkihueg@rz.uni-sb.de
=======================================================================

From: And Rosta
Subject: conlangs & MT
Date: Sun, 31 May 92 18:40:32 +0100

What exactly are the relevant issues in deciding on whether such and such a language is of utility for working with computers in general and for machine translation in particular? [This is not a rhetorical question; it is a statement of my ignorance.]

I imagine that it is assumed that the AL would serve as an interlanguage, right? In this case, I guess the following are very important:

1. unambiguous syntax & morphology
2. elaborately & rigorously documented grammar
3. elaborate & comprehensive semantics
4. elaborately & comprehensively documented semantics

I can claim the importance of (1) from personal experience: I've colleagues parsing a corpus of English texts, & it is not unusual for the (computer) parser to find a few hundred grammatical parses for the same sentence, which requires a human to come along & choose the correct (i.e. intended by the speaker) parse.
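
To give a feel for how a few hundred parses for one sentence can arise, here is a small Python sketch that counts the binary bracketings of an n-word string (the Catalan numbers). It is only an upper-bound illustration under the assumption that every bracketing is grammatical; it is not a model of the parser mentioned above, and the function name is made up for this example.

```python
from math import comb


def binary_bracketings(n_words: int) -> int:
    """Number of distinct binary trees over n_words leaves: Catalan(n_words - 1)."""
    if n_words < 1:
        raise ValueError("need at least one word")
    k = n_words - 1
    return comb(2 * k, k) // (k + 1)


for n in (4, 6, 8, 10):
    print(n, binary_bracketings(n))
# 4 5
# 6 42
# 8 429
# 10 4862
```

Even with most bracketings ruled out by the grammar, the count grows fast enough that a handful of attachment choices in an ordinary sentence can leave hundreds of candidates.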
It may, however, be the case that English is especially ambiguous: I have heard it claimed that Sanskrit parses with next to no syntactic ambiguity, & I have a friend who wrote a parser for Nama that, she said, did not produce an explosion of alternative parses. (I.e. Nama is therefore allegedly much less prone to syntactic ambiguity than English.)

As far as I know, of all languages, natural & invented, none come anywhere near passing tests (1) & (2), except for Lojban, which passes with flying colours, aplomb, etc.

As for test (3), natural languages have the lead, but in those areas of Lojban whose semantics have been worked out, I reckon that Lojban could claim to have the edge, at least in its ability to be less semantically ambiguous in matters of, e.g., quantification, than nat lgs.

No language scores very highly on (4), but since it's easier to document an invention than a discovery of tacit knowledge (i.e. nat lang semantics), an AL could have the edge, & again Lojban probably does the best out of the ALs. (I mean only published ALs, not ALs that don't exist but we could imagine.)

The above surely can't be so controversial, so - since it doesn't seem universally agreed that Lojban is way in the lead as an exceptionally computationally tractable language - I guess I must have misunderstood the issues that have been being discussed on this list.

---
And

From: jwt!bbs-hrick@peora.sdc.ccur.com (Rick Harrison)
Subject: spilling the Yalo beans
Date: Mon, 01 Jun 92 03:14:13 EDT
Organization: The Matrix

A couple of folks have asked "what the heck is Yalo?" It is a conlang, still under development. A peek at Yalo morphology:

  C1  C2  V
  --  --  -
  p   b   a
  c   j   e    (c = "sh" in "ship," j = "j" in French "bonjour")
  t   d   i
  s   z   o
  f   v   u
  k   g
      r
      l
  n
  m
  h
  w
  y

Four types of morphemes are possible:

     5 consisting only of vowels, which are interjections.
    55 consisting of a consonant from column C1 followed by a vowel.
    55 consisting of a C1 + a vowel + "ng" indicating nasalization.
  2200 consisting of a C1 + a vowel + a C2 + a vowel.

Thus, "o!" "ti" "kang" and "yalo" are possible words, but "ing" "loya" and "pikang" would not be allowed.

This morphology has the following advantages:

  No consonant clusters.
  Up to 2315 morphemes which are only 1 or 2 syllables long.
  Self-segregating words (a degree of "audio-visual isomorphism").

That's all (if not more than) I'm at liberty to disclose.

hrick === ``All generalizations are necessarily untrue, including this one.'' - Gene Burns

Date: Mon, 1 Jun 1992 19:28:55 +1000
From: Major
To: conlang@buphy.bu.edu
Subject: conlangs & MT (replies to Harlow & Rosa)
X-Mailer: GNU emacs 18.58

Don Harlow <72627.2647@compuserve.com> writes:

> Does this really mean anything? C has a parser (or parsers); Prolog has a
> parser (or parsers); but how many programs are there out there that can
> translate C source code to Prolog source code, or vice versa? Or is it just
> that nobody feels any great need to translate between source codes in
> two different high-level computer languages?

On the contrary, I know that there are translators from LISP and Pascal to C and would not be at all surprised to find one from PROLOG to C. Further, all compilers are in fact translators from one computer language (say C) to another (say VAX assembler or VAX machine code).

There are also translators from English to Spanish (and dozens of other language pairs): the difference between the programming language translators and the natural language translators is that the purchasers of the programming language translation programs expect those programs to produce 100% correct translations, whereas the users of the natural language translation programs expect only a rough draft from which they can produce a completed translation.

The reason for this difference in expectation is that natural language translation is a much harder problem than programming language translation. Why is this so?

1. Programming languages have a much narrower scope than natural languages (or most conlangs);

2. Programming languages have much simpler syntax than natural languages (some conlangs, notably lojban, fall closer to programming languages than natural languages on this scale).

3. Programming languages have a fully-defined semantics; no natural language or conlang that I am aware of shares this feature. I strongly suspect that a conlang which did would be unspeakable, both literally and figuratively.

And Rosta writes:

> I imagine that it is assumed that the AL would serve as an interlanguage,
> right? In this case, I guess the following are very important:
>
> 1. unambiguous syntax & morphology
> 2. elaborately & rigorously documented grammar
> 3. elaborate & comprehensive semantics
> 4. elaborately & comprehensively documented semantics
>
> As far as I know, of all languages, natural & invented, none come anywhere
> near passing tests (1) & (2), except for Lojban, which passes with
> flying colours, aplomb, etc.

lojban inherits these properties with only minor improvements from Institute Loglan. gua!spi, also derived from loglan, would also share them.

> No language scores very highly on (4),

The languages created for the purpose of being MT interlinguas (as opposed to speakable conlangs) do score well here. They are, on the other hand, totally useless for human-human communication.

> but since it's easier to document an invention than a discovery of
> tacit knowledge (i.e. nat lang semantics), an AL could have the
> edge, & again Lojban probably does the best out of the ALs.

This would be true if lojban was still a project, i.e. still being "invented", but it is not. Lojbab has repeatedly stated that he wishes lojban to become a "real" language the meaning of which is discovered by its speaker community, not invented by him or his cohorts. As such, lojban is only very slightly better off on (4) than English.
Major

Date: Mon, 8 Jun 92 13:51:16 -0400
From: jross@bu-conx.bu.edu (John Ross)
To: conlang@buphy.bu.edu
Subject: EmSighAy -> EhmayGheeChah

Dear conlangers,

Many of you are no doubt acquainted with the work of one Elmer Joseph
Hankes, i.e. the language "EmSighAy", or

[ASCII-art rendering of the name in Hankes' script]

in his orthography. I wrote to him a while ago and had no response
until last week, when I received a package from his organization, The
Hankes Foundation.

The envelope was interesting in and of itself. In a cartouche beneath
the return address was written

"TO THE KEEPERS OF THE FLAME, WHO FOSTER AND PROPAGATE KNOWLEDGE TO
GUIDE OUR POSTERITY TO AN EVER BRIGHTER FUTURE. HERE IS THE THE [sic]
NEXT SIGNIFICANT HUMAN ADVANCE --- UNIVERSAL NAMES FOR NUMBERS and A
UNIVERSAL SECOND LANGUAGE. LET THESE IDEAS BE KNOWN TO THE BRIGHT YOUNG
MINDS IN YOUR CARE SO THAT THEY MAY DO WHATEVER THEY CAN WITH THEM."

Beneath the cartouche it says "* POSTMASTER * PLEASE DELIVER TO THE
PERSON OCCUPYING THE POSITION MENTIONED." Err, I guess that's me. :-)

Elmer or his staff sent me a preliminary edition (May '92) of his book
titled

----------------------------------------------------------------------
[ASCII-art rendering of the title in Hankes' script]

A Universal Second Language
By Elmer Joseph Hankes

PRELIMINARY EDITION FOR RESEARCH PURPOSES ONLY

Persons wishing to learn this language and communication system
should obtain a copy of this first edition due in 1993/4
---------------------------------------------------------------------

The first line reads "Eh m ay ghee chah" which means "second language".
It represents the second version of Hankes' "EmSighAy", published in
1982. And, yes, the fourth letter above is a 2!

"The alphabet consists of 20 vowels, 20 consonants, and 16 POTENTS",
says Hankes in his introduction.
Potents are prefixes which signal the type of word which follows, such
as verb, time period, measurement, masculine, feminine, place name,
family name, company name (!), badness (!!), etc.

The vocabulary is constructed in a "Ro"-ordered fashion; namely, words
are arranged according to semantic classes marked by their first
syllable. So, "awk-ay" is "govern", "awk-wigh" is "authority", etc. To
make the verb "to govern", you need to find the potent which marks
verbs.

The Foundation also sent a cassette tape for pronunciation drills and a
handy set of alphabet flashcards.

This has got to be one of the best pieces of linguistic exotica I've
run into in a while. And it was free. The address is:

The Hankes Foundation
1768 Colfax Ave. South
Minneapolis, Minnesota 55403 - 3001
U S A

-- JR

Date: Wed, 10 Jun 92 14:47:40 -0400
From: ross@buphy.bu.edu (John B Ross)
To: conlang@buphy.bu.edu
Subject: Are you Kayolonian??

Not long ago, John Chalmers mentioned that a Dutch physician named Hans
Barnard and some of his friends created a language, culture, music,
history, costume, etc. of the "Kayolonians". In fact, JC said that the
"Friends of Kayolonia" have performed on "authentic" Kayolonian
instruments in Amsterdam in the Kayolonian language!!

Alright, John, you *must* tell us more! If anyone else out there in
conlangia (esp. you who live in the Netherlands) knows more about this,
ple-e-e-z share it with us!!

-- JR

Date: Fri, 12 Jun 92 04:54:25 PDT
From: chalmers@violet.berkeley.edu (John H. Chalmers Jr.)
To: conlang@buphy.bu.edu
Subject: Kayolonia

John: Information is available in Dutch, Kayenian and to some extent in
English from Stichting Vrienden van Keiolonie", Kotterstraat 22-24,
1794 BE Oosterend-Texel, the Netherlands. Kayenians live on/in
Kayolonia. The Dutch title translates as the Friends of Kayolonia
Foundation. The quote mark is a dieresis over the final e of
Keiolonie". Kayenian is Keiaans(e) in Dutch.
I've appended the text of a Kayenian children's song, transliterated
according to Dutch orthographic conventions, with the translation into
Dutch and English below:

"Bac,ahaasing lustrreu prroek noelkaa
Koo luparksee phaumo suptierr
Vaadiesi trroek noelkaa tarr sa"

(c, stands for a c-cedilla in the original transcription, as I doubt
the cedilla would be accepted by UNIX or my mailer)

Poesje Proek (die) likt zich schoon,
Maar zijn kop (daar) kan hij niet bij.
(Dus) dat doet poesje Troek (voor) hem.

"Kitten Prook licks himself clean, but he cannot reach his head.
So kitten Trook does that for him."

Most of the material I have is about Kayenian music and instruments.
So, I don't know to what extent the language has been developed. It
certainly doesn't look/sound like so many other conlangs based on
common Latin or international roots.

It may be only a coincidence, but the Dutch-Canadian SF writer A. E.
van Vogt invented a mythical country called "Calonia" in one of his
early novels. This coincidence makes me wonder whether this and
similar-sounding words (e.g., Kayolonia) sound especially pleasing to
Dutch ears. More generally, how highly do the phonetic structures of
conlangs correlate with the natlangs of their inventors?

-- John

Date: Wed, 24 Jun 1992 12:05 EDT
From: Ronald Hale-Evans
Subject: "thesarus" languages
To: conlang@buphy.bu.edu
X-Vms-To: CONLANG
X-Vms-Cc: EVANS

Reminds me of something from a couple of years ago: I was reading
_ju'i lobypli_ and saw a letter by a guy, a software engineer, who had
independently reinvented the a priori, classificatory, "thesaurus"-type
conlang in his youth and worked at it feverishly. He gave it up when,
one day, while thinking about his language, he looked at a graphite
pencil and saw it disassemble itself into its component parts before
his eyes--the wood peeled back, revealing the lead, the metal clamp and
rubber eraser detached themselves, and so on.
I thought this was such a great story that I called him and asked him
to contribute an article to the regular column "Notes from the
Mountains of Madness" in my zine _Singularity_, which recounts
experiences of people who have gone crazy. (He requested a copy of the
zine first. I sent it to him and he returned it a couple of weeks later
with a curt note. Must've been too weird for him; it had articles on
body modification and satire of Christianity... Sigh... :-)

Ron H-E

Date: 25 Jun 92 09:37:06 EDT
From: Don HARLOW <72627.2647@CompuServe.COM>
To: Conlang
Subject: Re: MCD Esperanto ack

To: Conlang >INTERNET:conlang@buphy.bu.edu
Dato: 920624

Reply to Rick Harrison's note of two days ago:

>PS. You switched from c^ s^ to cx sx convention; why the
>typographical change of heart? Don't you wish Zamenhof had gone
>ahead and reformed the E-o alphabet when he wanted to, instead of
>letting his adherents talk him out of it?

Sorry for the midstream switch -- most of the people on
soc.culture.esperanto and esperanto@rand.org use the so-called
"X-convention" for supersigns, which is useful for sorting and also
prevents Microsoft Word from making unwarranted assumptions about
control characters -- and I just sort of fell into it. Now I've
switched back.

As far as I've been able to tell, Zamenhof never _wanted_ to reform the
Esperanto alphabet -- coming as he did from a non-English-speaking
background, he never had that big emotional hangup about supersigned
letters. He included alphabet reform in his 1894 project -- but as far
as I can tell, that whole thing was largely a ploy to defuse reformism
(nobody as smart as Zamenhof could have seriously intended to promote
such an all-things-to-all-men project, guaranteed to be pleasing to
everybody and therefore pleasing to nobody).

In 1906-1907 Zamenhof showed himself _willing_ to introduce some
alphabet reform.
This was partly in response to a new wave of reformism in France; but
it was also a personal sop to his good friend Emile Javal, also an
ophthalmologist, who believed that supersigns were bad for the eyes.
When Javal was on his deathbed Zamenhof wrote him a letter in
un-supersigned Esperanto (H-convention); Javal noted on the letter
something to the effect that this was "of personal value only." After
the Ido split, Zamenhof did not again raise the question of alphabet
reform.

=============================================
Don HARLOW
Redaktoro Esperanto U.S.A.
tel. (1 510) 222 0187
CompuServe [72627,2647]
Internet 72627.2647@compuserve.com