Message-ID: <94Mar16.211449pst.58375@mu.parc.xerox.com>
Date: Wed, 16 Mar 1994 21:14:38 -0800
From: Pavel Curtis <Pavel@parc.xerox.com>
To: MOO-Cows@parc.xerox.com
Subject: The 1.8.x planned feature list

OK, at long last, here's my list of everything I am consciously planning to do in 1.8.x. Note that `disk-basing' and `multiple inheritance' are not on this list; the former is, as I've said, in the queue for 1.9.0, and the latter is not in the plan at all.

There are no release times in this message; I have no estimates available, so don't bother asking. Sorry.

The order of this list is almost entirely random and most certainly does not reflect my priorities, which are somewhat poorly worked out right now. The only firm decision in that regard is that the first 1.8.x release will contain the first three tasks listed (cleaned-up internal DB interface, exception-handling, and improvements to hostname lookup); this is because my current sources already contain partial implementations of all of these and I pretty much have to finish implementing them in order to get back to a compilable set of sources...

Before anyone else can say it, I'm well aware of just how mind-bogglingly long this list is; fortunately, most of the tasks are pretty small in scope, implementable in no more than a few days of my time. I expect, therefore, to have each succeeding 1.8.x release contain just one or two of the more major pieces and a selection of several of the smaller ones.

To help y'all understand which changes I think fall in which categories, I've labelled each change with one of the following three marks:

    --  a small change involving less than a day of my time
    ++  a substantial change, but still implementable within a week
    **  a relatively major change, probably requiring more than a week of my time

In all cases, these are only my best guesses and they are intended to include a fair amount of slack, since I rarely get to spend even a majority of any week doing hacking anymore.

Getting this list out to y'all is my part of a little bargain I'm hoping to strike. Your part is to let me know, either publicly or privately, which ten or so of these changes are most important to you. I'll take that feedback into consideration as I decide what to work on next at any point.

You can also help by reminding me of things I've promised at some point but have not included in the list. You can also, of course, try to convince me to add other things to the list, but since it's so long already, it may be a tough sell.

Finally, let me reassure you that I will try to be good about posting a proposed design for each of the more significant changes *before* I implement it, so that y'all can have a voice in the process. This is particularly true for the following changes:

    exception-handling
    in-DB command parsing
    floating-point numbers

I hope to send out my current (more-or-less complete) design for the first one of these in the very near future. One advantage of doing so is that I'll then be able to erase a good portion of my whiteboard again...

Enough prologue; here's the list.

    Pavel

** Clean up internal DB interface

Currently, the majority of the server code makes a lot of assumptions about the data structures and in-memory nature of the database; one important step on the way to disk-basing is to eliminate those assumptions. I am about 2/3 of the way through converting the server over to a strictly procedural interface to the DB that (hopefully) hides completely the fact that it's still kept in memory; if I do this well, the eventual switch to disk-basing will be much more localized, just covering the specifically DB-oriented modules of the code, a relatively small fraction of the server.

** Exception-handling and -signaling

The current `d' bit on verbs is clearly an extremely clumsy, error-prone, and insufficient mechanism for handling the errors raised by various built-in server operations.
The new mechanism, to be described fully in an upcoming message, makes it easy for programmers to catch just the errors they are prepared to handle, to raise and catch new kinds of errors specific to their applications, to cope reasonably cleanly with other people's code running out of ticks and/or seconds, and to capture the error traceback in a usable form before it can be displayed to an unwitting user. It may also brew your coffee for you in the morning.

++ multiple-process workaround for hostname lookup problems

A lot of people are having a variety of problems with the consequences of the server's current policy of abruptly interrupting the C library functions gethostbyaddr() and gethostbyname() when they take `too long' to translate a numeric IP address to/from a human-readable host name. I have finally worked out a portable mechanism for avoiding having to do that violence, involving the use of multiple extra UNIX processes to handle the lookups. It's pretty damned gross, in my opinion, but it should prove to be a great deal more robust.

--------------- Tasks above this line are already in progress. ---------------

++ MOO-code performance profiling

This will make it possible for MOO programmers to find out where all of the ticks/seconds are going in their applications, with a breakdown by MOO object/verb and built-in functions. Finally you'll be able to tell whether or not it's $list_utils:assoc that's really slowing you down. It will also be possible for wizards to use this facility to discover what code is actually spamming everyone when the lag is high.

++ MOO-code network servers

It will be possible for wizard-blessed code to listen for outside network connections on other TCP ports than the one the server itself is using, making in-MOO Gopher, SMTP, WWW, NNTP, etc. servers easy to write.

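The general shape of the multiple-process hostname-lookup workaround described above can be sketched as follows. This is only an illustration in Python, not the server's C code, and it is not the actual design (which the message does not spell out). The function name `lookup_in_helper` and its parameters are hypothetical. The blocking lookup runs in a forked helper process, so the main process never has to interrupt a C library call in its own address space; if the helper takes too long, it is the helper that gets killed.

```python
import os
import pickle
import select
import signal


def lookup_in_helper(lookup, arg, timeout):
    """Run lookup(arg) in a forked helper; give up after `timeout` seconds.

    Returns the lookup's result, or None if the helper failed or was too slow.
    (Hypothetical helper for illustration; UNIX only, since it uses fork.)
    """
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: perform the possibly slow, blocking lookup and report back.
        try:
            os.close(read_fd)
            try:
                result = lookup(arg)
            except Exception:
                result = None
            os.write(write_fd, pickle.dumps(result))
        finally:
            os._exit(0)  # never fall back into the parent's code

    # Parent: wait for an answer, but only for so long.
    os.close(write_fd)
    try:
        ready, _, _ = select.select([read_fd], [], [], timeout)
        if not ready:
            os.kill(pid, signal.SIGKILL)  # the helper absorbs the violence
            return None
        chunks = []
        while True:
            chunk = os.read(read_fd, 4096)
            if not chunk:  # EOF: the helper has exited
                break
            chunks.append(chunk)
        data = b"".join(chunks)
        return pickle.loads(data) if data else None
    finally:
        os.close(read_fd)
        os.waitpid(pid, 0)  # reap the helper so it does not become a zombie
```

A call such as `lookup_in_helper(socket.gethostbyaddr, "127.0.0.1", 5.0)` would be the analogue of the server's reverse lookup. A real server would presumably fold the pipe into its main network loop rather than wait in a separate `select()` call as this sketch does.
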
++ suspend()/resume(task-id [, value])

It will be possible for a task to suspend `forever' and for the owner of any suspended/forked task to cut short their associated waiting times, making them immediately eligible to run.

-- Unprogrammed verbs equivalent to ones with empty programs

Right now, unprogrammed verbs are, in most respects, utterly invisible (verb-call can't find them, for example) while verbs with empty programs (i.e., with the empty list of statements) are equivalent to ones that simply return zero. I want to change the treatment of unprogrammed verbs to make them just like empty ones, for the sake of consistency and simplicity of the programmer's model.

-- callers([task-id [, include-line-numbers]])

The first argument would allow a wizard or the owner of a task to get ahold of the stack for a suspended task. The second would allow you to get the current line number of all of the frames on the stack, for use perhaps in generating more useful traceback descriptions.

++ New, much faster regexp package

The current implementation of the match() built-in function is quite slow and poorly-written. The GNU project has a new implementation of roughly the same functionality that is much faster and more robust.

++ New built-in function `set_connection_user(conn, user)'

This would allow for more flexibility in how player objects are associated with network connections. For example, this would allow systems to use read() to prompt for information during the login process, or the construction of a command to allow a user to temporarily act as some other user (such as an administrator temporarily acting as their associated wizard, like in the UNIX `su' command). The design of this facility is not yet set.

-- open_network_connection() suspends the calling task

This would allow the server to continue doing useful work while waiting for an outbound connection attempt to succeed or fail without, as it currently does, imposing a set time limit on how long that takes.

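The suspend()/resume(task-id [, value]) item above can be illustrated with a toy cooperative scheduler. This is a sketch in Python under stated assumptions, not the server's implementation. The `Scheduler` class, `FOREVER`, and the method names are hypothetical; tasks are modelled as generators that yield the number of seconds they want to wait, and the ownership checks mentioned in the item are left out.

```python
import itertools

FOREVER = float("inf")  # stand-in for suspending with no time limit


class Scheduler:
    """Toy cooperative scheduler; each task is a generator yielding a delay."""

    def __init__(self):
        self.now = 0.0
        self._ids = itertools.count(1)
        self._tasks = {}  # task_id -> [generator, wake_time, pending_value]

    def spawn(self, gen_func, *args):
        task_id = next(self._ids)
        self._tasks[task_id] = [gen_func(*args), self.now, None]
        self._step(task_id)  # run the task up to its first suspension
        return task_id

    def resume(self, task_id, value=None):
        """Cut a task's wait short; `value` becomes the result of its yield."""
        task = self._tasks[task_id]  # KeyError if no such waiting task
        task[1] = self.now
        task[2] = value

    def advance(self, seconds):
        """Move the clock forward and run every task whose wait is over."""
        self.now += seconds
        for task_id in sorted(self._tasks):
            if self._tasks[task_id][1] <= self.now:
                self._step(task_id)

    def _step(self, task_id):
        gen, _, value = self._tasks[task_id]
        try:
            delay = gen.send(value)
        except StopIteration:
            del self._tasks[task_id]  # the task ran to completion
            return
        self._tasks[task_id][1:] = [self.now + delay, None]
```

A task that does `got = yield FOREVER` never wakes by itself, however far the clock advances; after `resume(task_id, "wake up")` it runs at the next `advance()` and sees `"wake up"` as the value of the `yield`.
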
++ Setting system parameters during server operation

This would allow you to change the ticks/seconds limits and other system parameters from MOO code, while the server is running. Some of the new facilities listed below, like making certain functions wiz-only, would also be controlled by this mechanism.

++ Limit on the number of background tasks per user

I'd like to make this configurable on a per-user basis so that some folks could be allowed more tasks than others. This would make it easier to keep from losing when somebody blows it in their use of `fork' and tries to create huge gobs of tasks.

-- Maybe some facilities for binary I/O, at least on outbound connections

This shouldn't be hard, but I haven't decided on a form for the facility to take, or on the permissions checks that should apply. Basically, I just want to enable the use of non-ASCII or non-line-oriented services from inside the MOO.

++ Statements analogous to `break' and `continue' from C

I haven't completely decided on the form this will take, but it will probably allow exit/continuation of any surrounding loop, not just the innermost one as in C.

++ Optionally making many of the built-in functions wiz-only

This would cover all functions that aren't pure data operations, like length(), index(), etc. Probably it will be possible to make only certain selected functions wiz-only.

-- Optionally making many of the built-in properties wiz-only

Ditto. Some folks, for example, have expressed interest in keeping .location/.contents secret.

++ Unambiguous (1-based) numeric names for verbs

This would finally allow reliable, unambiguous access to any verb on an object, regardless of naming conflicts. Rog posted the idea that I want to implement on this mailing list sometime (late?) last year.

-- Optionally disabling numeric strings as special indexical verb names

This would allow a complete switchover from the old, bad verb-indexing method to the new, good one.
Without disabling this, it could be impossible to refer by name to certain verbs.

++ `$' notation in subscripting brackets

I put forward this idea on MOO-Cows last year; the idea in short is that the subscripting brackets `[ ... ]' would allow within them the use of the special expression `$', meaning the length of the value being subscripted. This allows, for example, expressions like `x[2..$]' to get the `rest' of a list after the first element or `x[random($)]' to get a random element of a list.

** Support for in-db parsing

This will include built-in function support for efficient emulation of the current built-in parser and close variants. The built-in parser will no longer exist outside of those new built-ins. The design of this feature is not at all complete yet.

-- Adding OUTPUTPREFIX and OUTPUTSUFFIX as synonyms for PREFIX and SUFFIX

I'm told that this would make the MOO compatible with some other servers, allowing the same somewhat fancy clients to be used with both.

** Support for floating-point numbers

This will include simple arithmetic, some number of useful built-in functions (including the most common transcendentals), and formatted conversion of numbers to strings.

-- Maybe moving checkpoint scheduling entirely inside the DB

This would make the scheduling of checkpoints much more flexible and much easier for the wizards to change as demand warrants. Basically, this would involve simply removing code from the server that currently uses $dump_interval to decide when to make a checkpoint; instead, only the use of the existing dump_database() built-in function would cause a checkpoint to be taken. For those who might guess otherwise, this remains useful even in the eventual disk-based server, which still has something like checkpointing to do in order to make sure that there's a consistent copy of the DB on disk for backup mechanisms to save.

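The two `$' subscripting examples given earlier in this list, `x[2..$]' and `x[random($)]', amount to substituting the length of the subscripted value for `$'. A rough Python rendering of that meaning is below. The helper names are hypothetical, the indices are 1-based as in MOO, and the range helper omits MOO's range checking for brevity.

```python
import random


def moo_index(seq, i):
    """1-based element access, like MOO's x[i]."""
    if not 1 <= i <= len(seq):
        raise IndexError("Range error")
    return seq[i - 1]


def moo_range(seq, lo, hi):
    """1-based inclusive range, like MOO's x[lo..hi] (no range checks here)."""
    return seq[lo - 1:hi]


def rest(x):
    # x[2..$]: `$' stands for length(x)
    return moo_range(x, 2, len(x))


def random_element(x):
    # x[random($)]: MOO's random(n) returns an integer from 1 to n inclusive
    return moo_index(x, random.randint(1, len(x)))
```

Without `$', the same expressions have to name the value twice, as in `x[2..length(x)]`, which gets unwieldy when `x` is itself a long expression.
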
-- Prevent multiple verbs on a single object with the same name and args

There's no good reason for this sort of thing, it would prevent a certain amount of confusion, and it would mean that name/args would be a unique identifier for a verb on an object.

-- Built-in functions computing the size of an object/value in bytes

This is obviously useful for a number of MOOs that are moving to byte-based quota mechanisms; with the speedup possible by coding these primitives in C, it should be possible to keep the quotas much more current. I have considered trying to get the server to do all of the work of byte-based quota management, but it's pretty hard/inefficient to do it much better than what's already done in MOO code.

-- queued_tasks([value counts_only [, obj player]])

If COUNTS_ONLY is true, then for normal players only a count of tasks is returned and for wizards an a-list {{player, count}, ...} is returned, giving the counts for every player who has any queued tasks at all. If PLAYER is specified (only a wizard can specify a different player), then only the data for the specified player is returned. This will make it easier for wizards to quickly determine the state of the background task queues in the system.

++ Persistent `handle' on a verb/property across all kinds of changes

This is a new kind of MOO value, a `verbdef', which refers to the thing that `add_verb()' creates and that `set_verb_code()', `set_verb_args()', and `set_verb_info()' modify. This is quite useful for a MOO-code browser/editor, which could be guaranteed to keep its hand on the verb that the user specified at the beginning of a session in spite of concurrent changes to its name, args, code, etc. Ditto for properties once the property-renaming change is introduced (see below).

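To see why a persistent `verbdef' handle helps an editor, compare it with looking a verb up by name each time. The Python sketch below is illustrative only: `VerbDef`, `MooObject`, and their methods are hypothetical stand-ins, not the server's data structures or the eventual MOO-level interface.

```python
class VerbDef:
    """Stand-in for the proposed `verbdef' value: one verb definition."""

    def __init__(self, names, args, code):
        self.names = names  # space-separated verb names
        self.args = args    # (dobj, prep, iobj) specifiers
        self.code = code    # list of source lines


class MooObject:
    def __init__(self):
        self.verbs = []  # VerbDef instances, in definition order

    def add_verb(self, names, args, code=()):
        verbdef = VerbDef(names, args, list(code))
        self.verbs.append(verbdef)
        return verbdef  # the handle stays valid however the verb changes

    def find_verb(self, name):
        """Name-based lookup: the first verb whose names include `name`."""
        for verbdef in self.verbs:
            if name in verbdef.names.split():
                return verbdef
        return None
```

If an editor keeps the value returned by `add_verb()` and someone else renames the verb from "look glance" to "examine" in the meantime, `find_verb("look")` now comes back empty, while the handle still refers to the same definition.
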
++ New MOO value type: tables

Tables are like lists except that they map from values to values, rather than just from a small range of numbers to values; this is useful in a wide range of applications where programmers are trying to associate values with arbitrary `keys'. Implemented using hashing and some tricky collaboration with the reference counter, many common styles of use will be efficiently performed using side-effects without ever showing the MOO programmer anything that violates the current model that all values are immutable.

-- hash(value)

A primitive consistently mapping arbitrary MOO values into quite random integers (or maybe lists of integers, to get enough bits), enabling a variety of applications (like inter-MOO value transfer) where a quick check is needed to determine if two values are the same without having to actually transfer the value first. For the cognoscenti, this will probably involve the use of either MD5 or Snefru, two available cryptographically-secure hash functions.

-- toascii(char)/tochar(num)

Simple functions for converting between characters and numbers, making it easier (for one thing) to cope with binary I/O.

++ foo:bar, call_proc(proc, @args)

This makes available both parts of the expression `foo:bar(@args)' as separate pieces: (1) the verb lookup, including the permissions checks, and (2) the actual verb invocation, including the passing of the arguments. One thing this enables is the handing out of the ability to call a particular `!x' verb only to certain trusted parties without requiring hairy permissions checks in the verb itself. Also, and this is perhaps of some use for in-DB command parsing, a wizard could look up a verb with the wizardly permissions and then change to some user's permissions before actually invoking the verb.
In conjunction with the `x' bit change below, one could finally make all of the private verbs on an object `!x' and thus avoid the necessity for many in-verb permissions checks, since only trusted folks would be able to call the verb in the first place.

-- Regularizing the `x' bit on verbs

The `x' bit currently acts quite differently than the `r' or `w' bits on verbs; if `x' is off, then *nobody* can call it from MOO-code, not even the owner or a wizard. I want to change it to work like the other bits, whereby the owner or a wizard could call any verb and everybody else would get E_PERM if `x' is not set. I know that this would break some existing code that counts on the current behavior where a `!x' verb is simply invisible to verb-call's lookup procedure, but I think the number of such cases is pretty small. Feel free to object if you think I'm wrong about that.

++ Binary, compact DB file format

Right now, the DB file is written out entirely in printable ASCII, with the code for all verbs written out in source form. This slows down the checkpointing code, makes the file substantially bigger than it might be, and really slows down the DB loader on server startup, since it has to do a lot of relatively expensive parsing. I want to move to a compact binary format, in particular storing verb code in its compiled state. This should help both in speed and space, perhaps reducing both by as much as a third or more. Also, most of the new code for this will be useful in the disk-based system as well, since the DB will be in a binary format then anyway (probably GDBM, for those who care).

++ Server reboot without losing connections

This is a clever/gross hack whereby the server (but not the server machine) could be rebooted without kicking off all of the users. This will be more useful in the disk-based server but I think will be sufficiently so even now.
The idea is that the server would stop processing commands for a bit while it makes a special checkpoint (which includes *all* of the state of the server, including the pending input/output for all connections and the information about who's attached to what connection), uses the UNIX `exec()' function to start running the new copy of the server without losing all of the network connections, and loads in all of the data stored by that special checkpoint. Since that data includes information about which file descriptors are for which connections, it could re-establish all of the state from the previous server. The upshot is that the users would just see a rather long pause in server response (much shorter for the eventual disk-based server) instead of being booted and having to reconnect later. For many MOOs, this could be done at night or early in the morning without any users noticing.

++ `player' -> `user'

I want to remove all uses of the word `player' from the programming language, replacing it in all cases with `user'. This affects a number of built-in function names and the built-in variable `player'. This will be done in such a way as to allow an old DB to be automatically converted during loading. It may also be possible to allow either name to work even after loading, so that code could still be relatively easily transferred from an older MOO to a newer one.

-- $vname(...)

This is a new notation equivalent to `#0:vname(...)', by analogy to the current `$' notation for properties on #0. In systems that have made many built-in functions wiz-only, it might be useful to define any publicly-callable versions on #0 so that they could still be called with a concise name.

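The equivalence between `$vname(...)' and `#0:vname(...)' can be shown as a purely textual rewrite. The sketch below is for illustration only and assumes nothing about how the server's parser would actually handle the notation; the function name is hypothetical, and a plain regular expression like this one would also rewrite matches inside string literals, which a real parser would not.

```python
import re

# `$' followed by an identifier and an opening parenthesis is a call on #0.
_SYSTEM_VERB_CALL = re.compile(r"\$([A-Za-z_][A-Za-z0-9_]*)\s*\(")


def desugar_system_verb_calls(source):
    """Rewrite $vname(...) as #0:vname(...), leaving other uses of `$' alone."""
    return _SYSTEM_VERB_CALL.sub(r"#0:\1(", source)
```

Under this rewrite, `$notify(player, msg)` (a hypothetical publicly-callable wrapper on #0) becomes `#0:notify(player, msg)`, while property references such as `$list_utils:assoc(x, y)` and the subscripting `$' in `x[2..$]` are left untouched because no parenthesis directly follows the name.
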
++ Archwizard's emergency back door

This is a facility allowing the archwizard (the one with access to the actual machine on which the server is running) to start up the server in a very special single-user mode where arbitrary expressions and/or commands could be executed as a wizard without necessarily having to go through any verbs in the database. This makes it possible to more-or-less easily recover from nasty mistakes made even in such critical places as #0:do_login_command(), etc.

-- Non-suspending read()

Currently, the server always suspends a task that calls read(), even if there's already data available for reading on the connection. This makes certain applications that read a bunch of data from a network connection, like MOO-Gopher, pretty painfully slow. In the new server, it will be possible to get the old behavior but also to have such calls to read() return immediately with the read data. For MOO-Gopher and other such applications, this could improve performance by orders of magnitude.

++ Fix 16 v 32 v 64 bit problems?

I could go to the effort of making sure that the server always gets the sizes of integers it needs even on machines where C's `int' type is not 32 bits long. It's not entirely clear how important such a change would be, though, since almost all machines do have a 32-bit `int' type and I think all of the others have that as an option. Feel free to let me know if I'm wrong about this *and* it's important to you to have the server run on such a machine.

-- Allow renaming of properties

Currently, you cannot rename a property the way you can a verb. I can't see any reason for this inconsistency and it certainly has been a pain for a number of people at various times.

++ Verb argument names, checking

It has always been a personal embarrassment for me, and a pain for most MOO programmers, that verbs can only get their arguments as one list that they must destructure (and check the length of) explicitly.
I have a design for a simple argument name/number specification construct that also allows for optional arguments (with default values) and a list of the `rest' of the arguments; with this, you'd also get automatic checking for the correct number of arguments to a verb and raising of a more evocative error message than `Range error'.

?? General server performance improvements

It has been a long time since I did any performance profiling of the server and tried to make it substantially faster. I am guessing that there's at least a factor of two more-or-less readily available and perhaps as much as a factor of ten. Obviously, such an improvement would be welcome in a number of MOOs. It's hard to guess just how much effort is involved here, since I don't know how many relatively simple changes there are that could yield big improvements.

---------------------- End of monstrous list of features ----------------------
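As an illustration of the `Verb argument names, checking' item above, the following Python sketch binds a verb's argument list to named parameters with defaults and a `rest' list, and reports a descriptive error when the count is wrong. The design itself is not given in the message, so everything here is an assumption: the names `bind_args`, `REQUIRED`, and `ArgCountError` are hypothetical, and required parameters are assumed to come before optional ones.

```python
REQUIRED = object()  # sentinel marking a parameter that has no default


class ArgCountError(Exception):
    """Raised in place of a bare `Range error' when the call doesn't fit."""


def bind_args(spec, args, rest=None):
    """Bind `args` to the (name, default) pairs in `spec`.

    If `rest` is a name, any leftover arguments are collected under it;
    otherwise extra arguments are an error.
    """
    required = sum(1 for _, default in spec if default is REQUIRED)
    if len(args) < required or (rest is None and len(args) > len(spec)):
        most = "or more" if rest is not None else "to %d" % len(spec)
        raise ArgCountError(
            "expected %d %s arguments, got %d" % (required, most, len(args)))
    bound = {}
    for i, (name, default) in enumerate(spec):
        # Required parameters come first, so they are always covered here.
        bound[name] = args[i] if i < len(args) else default
    if rest is not None:
        bound[rest] = list(args[len(spec):])
    return bound
```

With `spec = [("target", REQUIRED), ("count", 1)]`, calling `bind_args(spec, ["#123"])` fills in the default for `count`, and calling it with an empty argument list raises `ArgCountError` with a message that states how many arguments were expected.
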