08 Feb, 2009, amylase wrote in the 1st comment:
Votes: 0
Hi guys,

Hope I'm not starting some sort of war or debate; that's not the intention. I've been playing with LPMud as a hobby for the past 3 - 4 years. More and more I'm finding the sort of computation I need is too taxing for an interpreter. Hence here I am, hoping to try out ROM or SMAUG or any of the DikuMUD derivatives (correct me if I'm wrong - ROM and SMAUG are both immediate or subsequent derivatives of the original DikuMUD, right?). Just looking at the number of posts and replies here, there appears to be more active discussion on ROM than SMAUG.

Basically what I am asking is: which DikuMUD derivative is easiest to set up, with neat code, ongoing updates, good documentation and active discussion happening? I've come across quite a handful of names: CircleMud, Merc, SillyMud, SMAUG, ROM, TBA and a few others. To me they are all MUDs that need compiling, and that's as much as I know about each. Which one is good to start on for a novice? (Something that is simple to use, not too messy and not obsolete.)

Thanks a lot.
08 Feb, 2009, The_Fury wrote in the 2nd comment:
Votes: 0
Out of your list, look at (in no particular order) TBA (Circlemud), SmaugFUSS (Smaug), and RAM (Rom) if they have released anything yet, or Quickmud if they have not. They are all actively developed projects, with communities that should be able to help you if and when things go wrong.
08 Feb, 2009, quixadhal wrote in the 3rd comment:
Votes: 0
Yep, if you need colour support or a more (or less) debugged ROM, Quickmud is probably the better choice. RAM is a slow mover; it's probably quite a bit cleaner to read, but it hasn't been stress-tested and is essentially ROM with OLC and some bug fixes right now.

You say you're running into computational issues. Are you taxing the CPU of your host overmuch, or are you running into the limit which most LP drivers place on single execution rounds? Classically, when coding something that takes a large number of "ticks", you have to break up the problem into chunks that can be done under the limit and continued via call_out's.

I ask, because you might be able to refactor your code to avoid that issue, and you might also be able to recompile the driver with a higher limit if you know you're well under the CPU ceiling.
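
For what it's worth, the chunking pattern looks roughly like this (sketched in Python rather than LPC, with made-up names, purely to show the shape of it):

```python
import threading

CHUNK = 200    # how many words we can afford per "execution round"
DELAY = 2.0    # seconds between rounds; stand-in for the call_out() delay

def score_words(words, scores, done):
    """Do one chunk of work, then re-schedule ourselves for the remainder."""
    chunk, rest = words[:CHUNK], words[CHUNK:]
    for w in chunk:
        scores[w] = scores.get(w, 0) + 1          # placeholder for the real per-word work
    if rest:
        # the LPC equivalent would be call_out("score_words", DELAY, rest)
        threading.Timer(DELAY, score_words, args=(rest, scores, done)).start()
    else:
        done(scores)

words = ("the zebra is a mammal with black and white stripes " * 100).split()
score_words(words, {}, lambda s: print(len(s), "distinct words scored"))
```

The point is simply that no single round does more work than the limit allows; each continuation picks up where the last one left off.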
08 Feb, 2009, Scandum wrote in the 4th comment:
Votes: 0
You could give emud a try: it's easy to set up, has neat code (if you set your tab width to 5), little documentation, and even less discussion. Then again, the last two go for most codebases nowadays.

http://www.mudbytes.net/index.php?a=file...
08 Feb, 2009, Kline wrote in the 5th comment:
Votes: 0
Hi, perhaps you can give us all some more specifics about your end goal, so we can better pitch our own favorites and projects? :P

I'm partial to AckFUSS, but probably only because I maintain it :). It's still got active development (me) whenever I get free time (not doing school work). As far as most heavily stress-tested, debugged, and overall "best workhorse", I'd go with the SmaugFUSS guys. They probably have the largest community and have the most developed codebase from where they started.

I.e., SmaugFUSS is just an "updated/fixed" version of Smaug, which descends more directly from Diku. RAM is the same concept, but for the ROM base; and AckFUSS is again the same concept, but for the lesser-known ACK!MUD codebase.
08 Feb, 2009, Sandi wrote in the 6th comment:
Votes: 0
I think it's fair to say ROM is obsolete. The available version is 2.4b, while the current code is 4.1. The mail list is dead, and few of the experts are still around. There was a time when just being a ROM was enough to get a steady trickle of players, but that's no longer true.


I'll also ask what you're trying to do, and what needs more computation. "The right tool for the job", and all that. A compiler is a clever gimmick, but in the end it's just a gimmick, and a limiting one, at that.
09 Feb, 2009, amylase wrote in the 7th comment:
Votes: 0
Hi guys, thanks a lot for the replies. Put simply, I am working on a linguistic bot. I am trying to improve on Eliza by giving her a sense of theme or topic, rather than just firing off open sentences based on discrete, unrelated inputs. My bot will digest dialogue as it rolls out and try to figure out what topic is being talked about. The more sentences there are on the topic, the easier it will be for the bot to find the topic word. It's kind of like a guessing game where each sentence tells you something about the same topic. The bot will cross-match keywords with a dictionary database that contains words, definitions, origins, tenses, synonyms, examples of use, etc. Input is dialogue. Output is one or two words describing the topic being talked about.

For example, if I feed these sentences into the machine, it should be able to tell me "Zebra" as output:
Input1: I am going to talk about an animal.
Input2: The animal is a mammal.
Input3: This mammal has four legs and a tail.
Input4: This animal has stripes.
Input5: The stripes are black and white.
Input6: The name starts with letter 'Z'.
Input7: You see plenty of these in Africa.
Input8: Anyone for a drink tonight?
Input9: Cargo Bar near the waters seems pretty nice.

If the bot is able to run these sentences through its engine against a good dictionary, it should be possible to come up with an accurate topic. Once the topic is known, my bot can actively contribute something relevant to the conversation. When a new topic emerges, the bot should be able to detect the change as well (in this case, starting from input 8, a new topic has clearly arisen).
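
In very rough terms, the scoring I have in mind would look something like this (a minimal Python sketch, with a tiny hard-coded table standing in for the real dictionary database):

```python
from collections import Counter
import re

# Tiny stand-in for the real dictionary database: topic word -> related terms.
LEXICON = {
    "zebra":  {"animal", "mammal", "stripes", "black", "white", "africa", "tail", "legs"},
    "banana": {"fruit", "yellow", "peel", "tropical", "monkey"},
    "bar":    {"drink", "tonight", "waters", "pub", "beer"},
}

def guess_topic(sentences):
    """Score every candidate topic by how often its related terms show up."""
    words = Counter(re.findall(r"[a-z']+", " ".join(sentences).lower()))
    scores = {topic: sum(words[t] for t in terms) for topic, terms in LEXICON.items()}
    return max(scores, key=scores.get), scores

topic, scores = guess_topic([
    "I am going to talk about an animal.",
    "The animal is a mammal.",
    "This animal has stripes.",
    "The stripes are black and white.",
])
print(topic, scores)   # -> zebra, with a clear margin over the other entries
```

A real run would of course pull the related terms from the dictionary rather than a three-entry table, and would also need to notice when no topic scores well - the signal that the subject has changed.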

I will need to access the dictionary frequently and rapidly - I'm looking at something like doing calculations on 1000 words per second. I thought an interpreter might be too slow for this kind of computation requirement.
09 Feb, 2009, Sandi wrote in the 8th comment:
Votes: 0
As an aside, reconsider the concept of "too slow". The best mob programmer I ever met, the guy who could convince you the shopkeeper really was a player, did it by duplicating the delay that occurs while someone is typing their answer. And yes, a rare random typo is the most convincing evidence of all. :)
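
Something along these lines, if you want the flavour of it (a rough Python sketch; the rates are pulled out of the air):

```python
import random, time

def humanize(reply, wpm=40, typo_rate=0.03):
    """Pause as if someone were typing the reply, and occasionally fumble a letter."""
    chars = list(reply)
    for i, c in enumerate(chars):
        if c.isalpha() and random.random() < typo_rate:
            chars[i] = random.choice("abcdefghijklmnopqrstuvwxyz")   # the rare random typo
    time.sleep(len(reply) / (wpm * 5 / 60.0))    # ~5 chars per word at wpm words a minute
    return "".join(chars)

print(humanize("Sorry, I only have two of those left in stock."))
```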
09 Feb, 2009, David Haley wrote in the 9th comment:
Votes: 0
Many such programs are written in interpreted languages. I wouldn't worry about that. I would worry much more about the very considerable difficulty of implementing a conversation robot that is able to parse the kind of sentences you mentioned. :wink: Do you have a background in NLP by any chance?
09 Feb, 2009, Silenus wrote in the 10th comment:
Votes: 0
I don't have a background in NLP, but I have read a couple of books on the topic, including one which pretty much walks you through developing an NLP system which can respond to queries. Most of the stuff in the book looks doable as a project of a few months if you mix in open source code, but the issue for me when looking at this was how best to obtain or construct a lexicon for the system. That part to me seems rather non-trivial compared to parsing sentences and mapping them onto first order logic and using a theorem prover to determine the semantic meaning of the sentence.

Component-wise, it seems like you need something like the following: a natural language parser which maps the natural language into first order logic, a theorem prover, and a lexicon. Parsing and translating into first order logic isn't too bad, and there are books that will walk you through it. Doing theorem proving again isn't too difficult, assuming you use an open source prover. The labour-intensive part is the lexicon. I was unable to find one of these as freeware or open source online, and I suspect constructing one manually would take a significant number of man-years.
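
To make the first component a bit more concrete, the parser -> FOL step on a toy grammar has roughly this flavour (a Python sketch with made-up templates and predicate names, nowhere near a real parser):

```python
import re

# Toy "grammar": a few sentence templates and the first-order-logic shape they map to.
TEMPLATES = [
    (r"the (\w+) is a (\w+)",          "all x. {0}(x) -> {1}(x)"),        # "The animal is a mammal."
    (r"this (\w+) has (\w+)",          "all x. {0}(x) -> has(x, {1})"),   # "This animal has stripes."
    (r"the (\w+) are (\w+) and (\w+)", "all x. {0}(x) -> ({1}(x) | {2}(x))"),
]

def to_fol(sentence):
    s = sentence.lower().strip(".")
    for pattern, shape in TEMPLATES:
        m = re.fullmatch(pattern, s)
        if m:
            return shape.format(*m.groups())
    return None   # a real parser would enumerate alternative readings instead of giving up

for s in ["The animal is a mammal.", "This animal has stripes.", "The stripes are black and white."]:
    print(s, "=>", to_fol(s))
```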

As for performance, I think it depends on which bits. The parsing doesn't need to be fast, but if the lexicon contains a lot of information, it is quite possible that the theorem prover might need to be.
09 Feb, 2009, Cratylus wrote in the 11th comment:
Votes: 0
My cat's breath smells like cat food.

-Crat
09 Feb, 2009, David Haley wrote in the 12th comment:
Votes: 0
Silenus said:
That part to me seems rather non-trivial compared to parsing sentences and mapping them onto first order logic and using a theorem prover to determine the semantic meaning of the sentence.

I think our trivialometers must be tuned quite differently: the idea of mapping sentences to FOL and using a theorem prover seems "rather non-trivial" to me! This is still an area of very open research, and the results are on toy language subsets or controlled language.
09 Feb, 2009, Silenus wrote in the 13th comment:
Votes: 0
Well, it depends on what you mean. Implementing a reasonably effective natural language parser (obviously the best parsers still don't cover full natural language - this is still open) isn't too troublesome. Mapping from some of these reasonably effective subsets onto FOL (though still a topic of research) is still possible to program in a short time frame (assuming you had some system for doing so). The final part, however, seems non-trivial in the sense that constructing a lexicon can, I suspect, take on the order of man-years worth of time, and you still might not have something that is close to the state of the art. This seems to be extremely labour-intensive.

These languages are obviously controlled, as are the latest developments in language parsing, but I suspect for a mud they may be adequate. You can parse some reasonably complex sentences with them and map them onto first order logic in meaningful ways. As I said, I am no expert on this topic, but I speculate that for either a command parser or a conversation system which is limited to certain standard sentence structures you can do a pretty good job already - and if you looked at the published literature, you could get close to the published state of the art if you so desired.

The lexicon, however, is a stumbling block for any such system. I think manually constructing an adequate ontology for comprehension is extraordinarily difficult in terms of labour. (As I said, this is just based on reading a couple of textbooks on the topic, and I am definitely no expert in this area.)
09 Feb, 2009, quixadhal wrote in the 14th comment:
Votes: 0
Consider that Prolog, the granddaddy of this kind of processing language, is itself an interpreted system. Compilation is not going to gain you very much here. It's all about algorithm choice and text processing power.

If I were an LPMUD programmer, I'd seriously consider the idea of writing the actual logic part of the system in perl and having IT query against an indexed RDBMS, and then use sockets to connect it to your lpc "bot". I know that's how Discworld does their SQL work, as it's simpler than trying to hook the queries directly into the driver, and perl is a good choice for lots of regex crunching.
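
The external-service side of that would be shaped something like this (sketched here in Python rather than Perl to keep it short, with a plain dict standing in for the indexed RDBMS):

```python
import socketserver

# Stand-in for the indexed RDBMS the mud-side bot would really query.
LEXICON = {"zebra": ["animal", "mammal", "stripes", "africa"]}

class LexiconHandler(socketserver.StreamRequestHandler):
    def handle(self):
        # One word per line in, one comma-separated list of related terms out.
        for line in self.rfile:
            word = line.decode().strip().lower()
            related = LEXICON.get(word, [])
            self.wfile.write((",".join(related) + "\n").encode())

if __name__ == "__main__":
    # The LPC bot opens a socket to localhost:5050 and sends words as lines.
    socketserver.TCPServer(("localhost", 5050), LexiconHandler).serve_forever()
```

The mud-side bot then just opens the socket, writes a word per line, and reads back the related terms.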

Of course, if you're using DGD, you may not have the option (as it doesn't normally allow outgoing sockets). LDMUD might have stronger SQL support, but I've never used it.

I just don't see switching to a Diku as any kind of gain for that kind of project. In fact, having to deal with strings in C will probably waste a great deal of your time, right there. :)
09 Feb, 2009, David Haley wrote in the 15th comment:
Votes: 0
Silenus said:
I speculate that for either a command parser or a conversation system which is limited to certain standard sentence structures you can do a pretty good job already

Command parsers are very different from conversation systems – the universe is quite a bit smaller, and you can get away with constraining the language more. If the idea is to make conversation seem "natural", and your mob can only understand carefully crafted sentences, one has to wonder if you achieved your goal.

It so happens that I've looked into this field a little bit… I think you're underestimating the difficulty. Mapping language to FOL is not something I'd call "not too troublesome", nor is it something I'd consider doable in a "short time frame".

In fact, as soon as you start talking about going through published academic literature, and getting to the "state of the art" academically, that alone is a flag to me that this isn't something that's terribly easy to do.

Here's a very simple example of a very "simple" problem: how do you map ambiguous words or sentence structures to FOL? (The problem is simple only in that it comes up very early.)

Basically I don't think somebody should call something easy unless they've really done it and gone through all the issues themselves. There's a big difference between saying "some people" (i.e. academics who've put years of effort into it) have done it, and that anybody can pick up textbooks and replicate it easily and in a short amount of time.
09 Feb, 2009, David Haley wrote in the 16th comment:
Votes: 0
quixadhal said:
Consider that Prolog, the granddaddy of this kind of processing language, is itself an interpreted system.

Well, it's worth noting that a decent (i.e. usable) Prolog implementation is also compiled up the wazoo, to a rather complicated "machine language". It wouldn't surprise me if the Prolog interpreter were quite a bit more efficient than the LPC interpreter.

quixadhal said:
and perl is a good choice for lots of regex crunching

Regexes are not well-suited to this kind of task at all, unfortunately. Regexes get you Eliza, not the kind of intelligent conversation agent we're talking about here. (Of course you could do it, but really, you want a parser, not a pattern matcher.)
09 Feb, 2009, quixadhal wrote in the 17th comment:
Votes: 0
I wouldn't use regexes to analyze the sentence, I'd use them to pre-parse the text input for tokenization. I might do word stemming via Lingua::Stem.... We had some luck with that in the past when doing keyword analysis on news articles.

If you can break the words down to their base forms and possibly also identify their parts of speech, THEN a good RDBMS can help do the heavy lifting. In any case, I think I'd want to use a language that was designed to work with strings, not C.
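
Roughly this, in other words (again a Python sketch, with a deliberately dumb suffix-stripper where Lingua::Stem would do the real work):

```python
import re

SUFFIXES = ("ing", "edly", "ed", "es", "s")   # nowhere near a real stemmer

def tokenize(text):
    """Pre-parse with a regex, then knock each token down to a rough base form."""
    stems = []
    for token in re.findall(r"[a-zA-Z']+", text.lower()):
        for suffix in SUFFIXES:
            if token.endswith(suffix) and len(token) - len(suffix) >= 3:
                token = token[: -len(suffix)]
                break
        stems.append(token)
    return stems

print(tokenize("The stripes are black and white."))
# ['the', 'strip', 'are', 'black', 'and', 'white'] -- crude (it over-strips "stripes"),
# which is exactly why you reach for a real stemming module instead.
```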
09 Feb, 2009, Silenus wrote in the 18th comment:
Votes: 0
Quote
Here's a very simple example of a very "simple" problem: how do you map ambiguous words or sentence structures to FOL? (The problem is simple only in that it comes up very early.)


You could be right, David - as I indicated, I have only read a couple of books on the topic and don't claim to be any sort of expert. I think you might have misinterpreted my post somewhat in this case, though. I said relatively easier - not easy (hardly). I suspect progress has been made in this area; I was looking at that Blackburn book, and it seems that building a system which is close to the state of the art (the book is dated 2005) isn't too tricky. As for mapping onto FOL, I believe the system does allow for enumerating possible alternative FOL sentences, i.e. multiple meanings, which copes with the obvious ambiguities that occur often in normal language.

Quote
Consider that Prolog, the granddaddy of this kind of processing language, is itself an interpreted system.


Quote
Well, it's worth noting that a decent (i.e. usable) Prolog implementation is also compiled up the wazoo, to a rather complicated "machine language". It wouldn't surprise me if the Prolog interpreter were quite a bit more efficient than the LPC interpreter.


I am not sure if you are referring to the Warren Abstract Machine here - but isn't this sort of like a VM? Or is additional compilation done these days with these systems? In either case, Prolog is probably not a good choice for handling this sort of thing except for toy example systems. You really need a full theorem prover, since my understanding is that Prolog's simplified semantics doesn't cope particularly well with the clauses generated by the parser -> FOL layers. Even naively implemented resolution-based provers are known to trip up on test cases (since, as I understand it, certain proof chains explode and don't return properly without optimization tricks).
09 Feb, 2009, Silenus wrote in the 19th comment:
Votes: 0
quixadhal said:
I wouldn't use regexes to analyze the sentence, I'd use them to pre-parse the text input for tokenization. I might do word stemming via Lingua::Stem.... We had some luck with that in the past when doing keyword analysis on news articles.

If you can break the words down to their base forms and possibly also identify their parts of speech, THEN a good RDBMS can help do the heavy lifting. In any case, I think I'd want to use a language that was designed to work with strings, not C.


I speculate that for most applications currently this would be the correct approach, but it falls a bit short of doing actual NLP, and thus the techniques required are a bit different (again, no expert on this). You would need some sort of parser devised for coping with ambiguities, such as Tomita's generalized LR parsing, for example. As for representing what you glean from the sentences you use as input, you really do need first order logic, which I gather is a superset of Codd's 8 operators for relational algebra - so an RDBMS probably does not cope well with this. Fortunately there is a category of tools which do this job quite well, i.e. theorem provers (I think the reigning speed champion is Vampire), which can cope with information encoded in this manner.
09 Feb, 2009, aidil wrote in the 20th comment:
Votes: 0
If this were really easy, we'd have had machines passing the Turing test already, but as far as I know that isn't happening yet, except maybe in rather strictly controlled circumstances (which invalidates the test but is a nice sign of progress).

At any rate, DGD doesn't provide outbound sockets by default indeed, but that is easily solved by using the network package from svn://wotf.org/dgd-devel-net (full disclosure: I'm the current maintainer of that code). Alternatively, the 'perl script' could make an inbound connection to DGD (which is what Dworkin would advise, but perhaps more relevantly, it is what Skotos is doing for their DGD-based game engine).

That said, I'm not entirely sure this is needed at all. While I have only implemented the natural language parsing part with a more limited context, I have been looking at how to deal with the somewhat large dataset of a cross-referenced lexicon and what could be called 'generic context' (i.e., things a human can be expected to know).

So far I have investigated adding native support for bdb to DGD, using an external relational database, and using LPC objects as data containers (changing a few defines lets DGD scale to 4G objects, provided your lib can deal with that also). The latter is coming out as the winner for me, because of being able to add 'native' logic to items in the lexicon, and because the effort required is only a small part of the total work needed for building a usable dataset.

The natural language parsing part could easily be done by DGD's string parser, which is often used for command parsers but is a generic LR parser that can do many things (the DGD network package contains a grammar to use the string parser to decode mudmode data for use with Intermud-3, for example, and I have used it for other network protocols as well), and it happens to be good at NLP.

Aidil