03 Jan, 2010, Confuto wrote in the 1st comment:
Votes: 0
Suppose we have a MUD written in Python and we want to allow builders to write scripts (also in Python) and attach them to mobs/rooms/etc. Is there a safe way to run these potentially dangerous scripts?

Googling on the subject has led me to believe that eval() is potentially unsafe, but that an alternative might be to parse the scripts into ASTs and walk them - unfortunately the Python documentation on this is incomplete, so I don't really understand it.

I was considering maybe writing a lexer/parser to convert a custom, Python-esque scripting language into Python. Would that be overkill? More importantly, would it be safe? Am I right in thinking that that would give me total control over what sort of functions/builtins are permitted? It seems that the AST method is sort of similar to this approach - is the former preferable?

I realise that there's no completely safe way to execute untrusted scripts, I'm really just looking for a way that's safe enough, while still being versatile.
03 Jan, 2010, David Haley wrote in the 2nd comment:
Votes: 0
Take a look at this approach using eval – you can restrict the set of available functions that eval uses, and in particular remove access to globals you don't want the script to use. In general eval can restrict the global and local environments. Obviously you still have to worry about things like infinite loops. I'm not sure which unsafe property of eval is bothering you, so this might or might not answer the question.
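In concrete terms, a minimal sketch of that kind of restriction might look like this (the helper name and the whitelisted functions are invented for illustration):

```python
# Minimal sketch of eval() with a restricted environment: the script sees
# only the names we explicitly place in its global dictionary.
def run_script(source, character):
    safe_globals = {
        "__builtins__": {},                   # hide the real builtins
        "len": len, "min": min, "max": max,   # hand-picked safe functions
        "character": character,               # the one object scripts may use
    }
    return eval(source, safe_globals, {})

print(run_script("max(character['hp'], 0)", {"hp": 42}))  # -> 42

try:
    run_script("open('/etc/passwd')", {})
except NameError:
    print("open() is not available")          # builtins were stripped
```

This controls which names resolve inside the script, though as later posts in the thread point out, it does not stop introspection tricks on the objects you do hand over.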
03 Jan, 2010, Barm wrote in the 3rd comment:
Votes: 0
In cryptology, there's a saying that goes 'anyone can design a cipher they themselves cannot crack'. It's really, really tough to determine if something is safe from the ingenuity of others.

I'm doing the same thing with my project – allowing designers to store scripts in rooms and items (YAML) that are basically Python with whitespace indentation replaced by brackets. I wrote a lexer that breaks everything into tokens and screens out 47 potentially hazardous keywords like "import" and "open" (including looping constructs). Then I compile the snippet to bytecode and exec it using a limited environment dictionary.

The next phase is to stock the environment with a library of functions their scripts are allowed to call.
04 Jan, 2010, Confuto wrote in the 4th comment:
Votes: 0
I'm not really concerned about things like infinite loops and similar dangers - I figure there are plenty of ways to crash the MUD using even the most basic scripting ('foo' * 100 ** 3). What most worries me is damage to the server running the MUD or to the MUD's data. I guess that means I should be trying to prevent importing of disallowed modules, file I/O and database access.

Something that I'm particularly interested in is finding a way to limit what attributes/methods of a class are available inside the restricted environment. Scripts that are attached to characters, for instance, will have the character object available to them when they are executed. I'd like a way to prohibit the use of, for instance, the load() and dump() methods of any object from inside the restricted environment.

David Haley said:
I'm not sure which unsafe property of eval is bothering you, so this might or might not answer the question.

I suppose I should've expanded on that in the OP. As I said above, I would like a way to filter out the use of particular functions/methods, regardless of whether they are builtins or not. This is why I mentioned lexing and AST walking. That said, I'm not very knowledgeable on these matters and my main concern with eval() is that I've seen people who (seemingly) are knowledgeable on it voice concerns - mainly that it's possible to 'break out' of a restricted environment.

Am I right in thinking that the AST approach sort of 'expands' the entire thing and looks for forbidden builtins everywhere? I.e. if I call character.dump() which calls storage.store() which, in turn, calls open(), would that fail using an AST?

Barm said:
In cryptology, there's a saying that goes 'anyone can design a cipher they themselves cannot crack'. It's really, really tough to determine if something is safe from the ingenuity of others.

I'm doing the same thing with my project – allowing designers to store scripts in rooms and items (YAML) that are basically Python with whitespace indentation replaced by brackets. I wrote a lexer that breaks everything into tokens and screens out 47 potentially hazardous keywords like "import" and "open" (including looping constructs). Then I compile the snippet to bytecode and exec it using a limited environment dictionary.

The next phase is to stock the environment with a library of functions their scripts are allowed to call.

Did you find that particularly taxing? Is there some way to allow restricted access to particular builtins (import, for instance) when using a lexer? I suppose you'd just let it through the lexing stage then override the builtin import when building your environmental dictionary.
04 Jan, 2010, David Haley wrote in the 5th comment:
Votes: 0
Confuto said:
Scripts that are attached to characters, for instance, will have the character object available to them when they are executed. I'd like a way to prohibit the use of, for instance, the load() and dump() methods of any object from inside the restricted environment.

I don't think you can easily do this short of going to an awful lot of trouble with the lexer/AST method. I would recommend that you pass not the actual character object, but a wrapper character object, to these 'untrusted' scripts. Then you can give your wrapper only those methods that you want it to have (which it defers to the underlying object).

For example, in pseudo-code:
class CharacterWrapper(object):
    def __init__(self, character):
        self._character = character

def make_deferer(method_name):
    def deferer(obj, *args, **kwargs):
        # Defer to the wrapped character, not the wrapper itself,
        # and don't pass obj along again as an argument.
        return getattr(obj._character, method_name)(*args, **kwargs)
    return deferer

for method_name in ['get_hp', 'get_mana', 'do_damage']:  # … etc.
    setattr(CharacterWrapper, method_name, make_deferer(method_name))


The above is completely untested and likely has some issues, but that's the general idea.

Then you would never make the full character object available, but only the wrapper – and in particular the wrapper won't have any unsafe methods associated with it.
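A self-contained variant of the same idea uses __getattr__ to whitelist methods (all names here are invented for illustration, and note the caveat in the comments: a determined script could still reach the underlying object through _character):

```python
# Sketch: exposing only a wrapper, not the real character, to scripts.
class Character(object):
    def __init__(self, hp):
        self.hp = hp
    def get_hp(self):
        return self.hp
    def dump(self):                     # "unsafe" method we want hidden
        raise RuntimeError("should not be reachable from scripts")

class CharacterWrapper(object):
    _ALLOWED = ('get_hp',)              # whitelist of deferred methods
    def __init__(self, character):
        self._character = character
    def __getattr__(self, name):
        if name in self._ALLOWED:
            return getattr(self._character, name)
        raise AttributeError(name)

# Caveat: scripts can still write character._character.dump(); a real
# sandbox also has to screen attribute access, as discussed below.
wrapper = CharacterWrapper(Character(hp=30))
print(eval("character.get_hp()", {"__builtins__": {}, "character": wrapper}))  # -> 30
```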

Confuto said:
That said, I'm not very knowledgeable on these matters and my main concern with eval() is that I've seen people who (seemingly) are knowledgeable on it voice concerns - mainly that it's possible to 'break out' of a restricted environment.

I haven't looked into Python sandboxing enough to know exactly why this is, but if you have concrete examples of "breaking out" I would like to see them. It seems to me that if Python lets you specify the global dictionary and the local dictionary, then if you were able to access a symbol not reachable from those dictionaries, there is a bug in Python somewhere. (Of course, you must very carefully construct your dictionaries so that the only reachable symbols are good ones.)

Confuto said:
Am I right in thinking that the AST approach sort of 'expands' the entire thing and looks for forbidden builtins everywhere? I.e. if I call character.dump() which calls storage.store() which, in turn, calls open(), would that fail using an AST?

I guess. This whole approach rings all kinds of warning bells for me and strikes me as serious overkill. Barm's approach is kind of a compromise.

I'm not sure how much background you have with ASTs, so before discussing it further, could you maybe give your comfort level with them? Using an AST correctly will require some background in compilers, so I hesitate quite a bit to recommend it as a solution.
04 Jan, 2010, elanthis wrote in the 6th comment:
Votes: 0
Never filter OUT restricted methods. Instead, use a whitelist. If you try to come up with a list of banned functions, you will miss some. Or a new Python release will add new ones. And then your security is shot.

Unfortunately, Python just flat out sucks in this regard. It has absolutely no usable sandboxing mechanism, which most other dynamic languages have provided for years (Ruby, JavaScript, Lua, etc.).

If your goal is merely to protect the host machine itself, then do not try to protect the MUD from within itself – protect it from the host. That is, run the MUD in its own jail and minimal environment. (chroot jails on Linux, BSD jails, whatever.) That will give you everything you're looking for. Just make sure that your source control repository is hosted outside of the jail and accessible via password-protected network connection (svn over https, git over ssh, whatever) so that a potentially malicious builder cannot permanently hose all your work.

You should trust your builders. Don't give people privileges if you aren't reasonably sure they deserve them. Same goes for coders.

If you REALLY need to allow untrusted builders to work on your code (which is silly, since they could just ruin the MUD in a million ways), you are probably out of luck with Python. The work-arounds to get fake sandboxes in place are either massively complex (on the order of writing an entire new scripting engine… in a scripting engine) or full of holes (lexers with blacklists).
04 Jan, 2010, Confuto wrote in the 7th comment:
Votes: 0
David Haley said:
I don't think you can easily do this short of going to an awful lot of trouble with the lexer/AST method. I would recommend that you pass not the actual character object, but a wrapper character object, to these 'untrusted' scripts. Then you can give your wrapper only those methods that you want it to have (which it defers to the underlying object).

[…]

Then you would never make the full character object available, but only the wrapper – and in particular the wrapper won't have any unsafe methods associated with it.

That's a neat idea, thanks.

David Haley said:
I haven't looked into Python sandboxing enough to know exactly why this is, but if you have concrete examples of "breaking out" I would like to see them. It seems to me that if Python lets you specify the global dictionary and the local dictionary, then if you were able to access a symbol not reachable from those dictionaries, there is a bug in Python somewhere. (Of course, you must very carefully construct your dictionaries so that the only reachable symbols are good ones.)

Here's an example. I think it's mainly an issue with the (im)proper construction of the restricted environment - thing is, of course, that the more you want to allow in eval(), the harder it becomes to keep the environment safe.
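For the record, the usual 'break out' relies on object introspection, which stripping __builtins__ does not prevent. A minimal demonstration:

```python
# Even with builtins stripped, a script can walk from a plain literal up
# to object and back down to every class loaded in the interpreter.
env = {"__builtins__": {}}
expr = "[c.__name__ for c in ().__class__.__bases__[0].__subclasses__()]"
names = eval(expr, env)
print("BaseException" in names)  # -> True: classes we never handed over
```

Any class reachable this way is a potential stepping stone to file objects and worse, which is why restricting the name dictionaries alone is not enough.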

David Haley said:
I'm not sure how much background you have with ASTs, so before discussing it further, could you maybe give your comfort level with them? Using an AST correctly will require some background in compilers, so I hesitate quite a bit to recommend it as a solution.

Yeah, I don't know anything about ASTs or compilers, and certainly won't be using that method. It seems far too complex for what I want to do.

elanthis said:
Never filter OUT restricted methods. Instead, use a whitelist. If you try to come up with a list of banned functions, you will miss some. Or a new Python release will add new ones. And then your security is shot.

I agree with this whole-heartedly.

elanthis said:
If your goal is merely to protect the host machine itself, then do not try to protect the MUD from within itself – protect it from the host. That is, run the MUD in its own jail and minimal environment. (chroot jails on Linux, BSD jails, whatever.) That will give you everything you're looking for. Just make sure that your source control repository is hosted outside of the jail and accessible via password-protected network connection (svn over https, git over ssh, whatever) so that a potentially malicious builder cannot permanently hose all your work.

I thought of this, but it seems to be a rather big solution to a fairly small problem.

elanthis said:
You should trust your builders. Don't give people privileges if you aren't reasonably sure they deserve them. Same goes for coders.

This is, of course, true, but giving out unrestricted (or poorly restricted) scripting access to builders seems just as silly as giving them shell access. It's easier to trust people if you're not giving them that much power, after all. :wink:
04 Jan, 2010, Twisol wrote in the 8th comment:
Votes: 0
@Confuto
To the jail thing: I don't think that's a "big solution", it's a "common sense solution". >_>
To the scripting access thing: Then Python might not be the right tool for the job. I think I heard somewhere, anyways, that it only obfuscates private member names rather than actually hiding them? Even if that's not true… from what I'm hearing here, another language would be much simpler to use for sandboxing. You could roll your own, too - plenty have - and that would give you total control, but that's a pretty complex undertaking.
04 Jan, 2010, Confuto wrote in the 9th comment:
Votes: 0
Twisol said:
@Confuto
To the jail thing: I don't think that's a "big solution", it's a "common sense solution". >_>

Perhaps 'big' was the wrong word. I would prefer a solution that's within the MUD itself. After all, what happens if it's being run on Windows? If (for whatever reason) whoever's running it is unable to put it in a jail? If they're too lazy to put it in a jail?

Twisol said:
To the scripting access thing: Then Python might not be the right tool for the job. I think I heard somewhere, anyways, that it only obfuscates private member names rather than actually hiding them? Even if that's not true… from what I'm hearing here, another language would be much simpler to use for sandboxing. You could roll your own, too - plenty have - and that would give you total control, but that's a pretty complex undertaking.

Given that the rest of the MUD is written in Python, it made sense to me to put scripting in Python (or a subset thereof). Embedding another scripting language in Python is probably beyond my expertise and, quite frankly, seems a little weird.

But then again, there's always LOLCODE.
04 Jan, 2010, donky wrote in the 10th comment:
Votes: 0
elanthis said:
Unfortunately, Python just flat out sucks in this regard. It has absolutely no usable sandboxing mechanism, which most other dynamic languages have provided for years (Ruby, JavaScript, Lua, etc.).


It is all very well to mention that Ruby, JavaScript, Lua and other languages have usable sandboxes, but without proof you are propagating debatable claims, to the possible detriment of others. The last time I encountered a discussion about Python not having a secure sandbox, there were also claims about other languages (Ruby, for instance) having one. However, a quick googling found web sites (belonging to people regarded as authoritative in the relevant domain, on Reddit at least) disproving that claim, at least for Ruby.

A secure sandboxed Python is entirely possible through PyPy. Whether it can sandbox isolated code within itself, or can have that functionality added within it, is not something I know about.

I agree with you fully that whitelisting is the way to go. The only way I would seriously consider allowing sandboxed third-party code to be run within my Python MUD library is at the expression level.
04 Jan, 2010, Barm wrote in the 11th comment:
Votes: 0
Confuto said:
Did you find that particularly taxing? Is there some way to allow restricted access to particular builtins (import, for instance) when using a lexer? I suppose you'd just let it through the lexing stage then override the builtin import when building your environmental dictionary.


My lexer uses regular expressions to find tokens. Restricted keywords are searched for third in line, after comments and strings. This is the restricted list I'm using:
## The following keywords are not permitted within scripts.  Expand as needed.
_RESTRICTED = [
    'Exception', '__builtin__', '__class__', '__debug__', '__dict__',
    '__init__', '__local__', '__subclasses__', 'as', 'assert', 'break',
    'class', 'compile', 'continue', 'def', 'del', 'delattr', 'dict',
    'dir', 'eval', 'except', 'exec', 'execfile', 'exit', 'file', 'finally',
    'for', 'from', 'getattr', 'global', 'import', 'input', 'lambda', 'locals',
    'object', 'open', 'property', 'raise', 'raw_input', 'reload', 'return',
    'setattr', 'staticmethod', 'try', 'while', 'with', 'yield',
]

If a script contains any of those it's rejected. Elanthis is correct that filtering out is not ideal and does not offer protection against language updates or hidden exploits. As David said, it's a compromise. My design calls for a LOT of tiny scripts compiled to bytecode and stored in dictionaries. This should stop 99% of mischief but Admins should eyeball every script just to be sure.
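(As an illustration of the same screening idea using the standard tokenize module rather than hand-rolled regexes - Barm's actual lexer is regex-based, and the RESTRICTED set and function names here are invented for the sketch:)

```python
import io
import tokenize

RESTRICTED = {"import", "open", "exec", "eval", "getattr", "__class__"}

def screen(source):
    """Reject the script if any restricted name appears as a real token.
    Comments and string literals come through as their own token types,
    so 'import' inside a string does not trigger a rejection."""
    tokens = tokenize.generate_tokens(io.StringIO(source).readline)
    for tok_type, tok_str, _, _, _ in tokens:
        if tok_type == tokenize.NAME and tok_str in RESTRICTED:
            return False
    return True

def run(source, env):
    # Compile the screened snippet and exec it in a limited dictionary.
    if not screen(source):
        raise ValueError("script rejected")
    code = compile(source, "<script>", "exec")
    exec(code, {"__builtins__": {}}, env)
```

For example, screen("import os") is rejected while screen("s = 'import'") passes, and run("hp = hp + 5", env) updates only the names in env.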
04 Jan, 2010, David Haley wrote in the 12th comment:
Votes: 0
Confuto said:
Here's an example. I think it's mainly an issue with the (im)proper construction of the restricted environment - thing is, of course, that the more you want to allow in eval(), the harder it becomes to keep the environment safe.

Oh, that's pretty interesting. I do sandboxing quite often in Lua, where things work quite differently. The exploits mentioned in that page have mostly to do with being able to access objects and using their methods to do fancy things (the trick of getting a number, then going to its class's superclass, and going to its subclasses, was cute), and seems to rest on the assumption that Python has a lot of statements that are always available (like '__import__').

You are completely correct that the more power you want in the eval environment, the more trouble you are being exposed to. :wink:

Confuto said:
This is, of course, true, but giving out unrestricted (or poorly restricted) scripting access to builders seems just as silly as giving them shell access. It's easier to trust people if you're not giving them that much power, after all. :wink:

Well, here's the thing. Presumably, your builders aren't trying to really hack things. And even if they are, they'll have to do some pretty weird things to get around a decent sandbox (just look at the examples on that page). If you require that all non-trusted builders have their scripts reviewed by somebody who knows Python, you'll be able to see pretty easily if they're doing something funky.

Embedding Lua into Python is probably not hugely difficult depending on what exactly you want to do with it. Even so I would use this as a 'last resort' if it is not acceptable to you to review builder scripts, because overlapping languages is always a source of confusion and maintenance difficulty.

Basically, you need to balance your security and trust requirements against being able to actually get your game running. A lot of security issues can be "solved" (for some definition of "solved", of course) with social or policy fixes rather than technical fixes.
05 Jan, 2010, donky wrote in the 13th comment:
Votes: 0
For some reason tinypy came to mind. It turns out there was a project to add a sandbox mode to it as part of Google Summer of Code. Of course, as with many open source projects, documentation is lacking, so how suitable it is remains to be seen.

This Stack Overflow page also points out PyMite, and highlights that it is easily restricted.

So there are at least two unexplored restricted execution options. Both I imagine more accessible than PyPy, which in my experience takes quite a bit of work to get into.