<!-- MHonArc v2.4.4 --> <!--X-Subject: [MUD-Dev] bytecode results --> <!--X-From-R13: Quevf Uenl <ptNnzv-pt.UenlEntr.Sqzbagba.OP.QO> --> <!--X-Date: Thu, 18 Jun 1998 21:17:42 -0700 --> <!--X-Message-Id: 199806190411.WAA02456@ami-cg.GraySage.Edmonton.AB.CA --> <!--X-Content-Type: text/plain --> <!--X-Head-End--> <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN"> <html> <head> <title>MUD-Dev message, [MUD-Dev] bytecode results</title> <!-- meta name="robots" content="noindex,nofollow" --> <link rev="made" href="mailto:cg#ami-cg,GraySage.Edmonton.AB.CA"> </head> <body background="/backgrounds/paperback.gif" bgcolor="#ffffff" text="#000000" link="#0000FF" alink="#FF0000" vlink="#006000"> <font size="+4" color="#804040"> <strong><em>MUD-Dev<br>mailing list archive</em></strong> </font> <br> [ <a href="../">Other Periods</a> | <a href="../../">Other mailing lists</a> | <a href="/search.php3">Search</a> ] <br clear=all><hr> <!--X-Body-Begin--> <!--X-User-Header--> <!--X-User-Header-End--> <!--X-TopPNI--> Date: [ <a href="msg01136.html">Previous</a> | <a href="msg01138.html">Next</a> ] Thread: [ <a href="msg01145.html">Previous</a> | <a href="msg01116.html">Next</a> ] Index: [ <A HREF="author.html#01137">Author</A> | <A HREF="#01137">Date</A> | <A HREF="thread.html#01137">Thread</A> ] <!--X-TopPNI-End--> <!--X-MsgBody--> <!--X-Subject-Header-Begin--> <H1>[MUD-Dev] bytecode results</H1> <HR> <!--X-Subject-Header-End--> <!--X-Head-of-Message--> <UL> <LI><em>To</em>: <A HREF="mailto:mud-dev#kanga,nu">mud-dev#kanga,nu</A></LI> <LI><em>Subject</em>: [MUD-Dev] bytecode results</LI> <LI><em>From</em>: Chris Gray <<A HREF="mailto:cg#ami-cg,GraySage.Edmonton.AB.CA">cg#ami-cg,GraySage.Edmonton.AB.CA</A>></LI> <LI><em>Date</em>: Thu, 18 Jun 1998 22:11:38 -0600</LI> <LI><em>Reply-To</em>: <A HREF="mailto:mud-dev#kanga,nu">mud-dev#kanga,nu</A></LI> </UL> <!--X-Head-of-Message-End--> <!--X-Head-Body-Sep-Begin--> <HR> <!--X-Head-Body-Sep-End--> <!--X-Body-of-Message--> <PRE> As I promised a month or two ago, here are the results from my experiments with putting a bytecode interpreter into my server. This was mostly written back then, with a few words added now. It is nice to back *using* a computer instead of administering it! This will be part of a document included with my system. This will be a document included with the system, hence the format and the introductory material. Hmm. I'll have to invent a new name for my system, now that I will no longer run on just Amiga's. Any suggestions? Is 'CGMud' too lame? AmigaMUD, Copyright 1998 by Chris Gray The ByteCode system in AmigaMUD What is Bytecode? Bytecode is just a different internal representation for the actions (functions) in the system. It is slightly more compact, but, most importantly, it is easier to interpret, so executes faster. I'm not certain of the name, but it likely comes from the fact that the "opcodes" in bytecode are always exactly one byte long. The bytecode interpreter in AmigaMUD executes instructions for a simple virtual machine, just like a Java virtual machine does. Again, like the Java virtual machine, the AmigaMUD virtual machine is a stack machine, in that it doesn't have any registers (other than the program counter, the stack pointer and a frame pointer), and so pushes and pops things from the stack a lot. The speed increase comes from the simplicity of the decoding of the bytecode (each instruction starts with a one-byte code that completely specifies what it does), and the fact that the bytecode "compiler" can make some optimizations. Why Did I Add the Bytecode system? The non-bytecode interpreter in AmigaMUD is fairly efficient, quite a bit moreso than the systems in many MUD servers. This is mostly due to the strongly typed nature of the AmigaMUD programming language, and to the effort I put into making it fast. However, when I wanted to investigate fractal world generation, I found that it was far too slow in AmigaMUD. So, I thought about ways of speeding it up. I could have made the system generate native 68000 code, since I have experience in 68000 code generation. That would have given me the fastest execution. However, I want my system to be portable to other architectures, and the effort of doing native code generation on all of the various UNIX platforms, as well as for the ubiquitous Intel x86 architecture is very large. Using a bytecode system can keep my system portable, but still get good speedup. Why Didn't I Use Java Bytecode? There are three main reasons for this. I won't try to put them in any order of importance, because I don't know the order: - it was lots of fun to do my own system - Java bytecode is a lot more complicated than mine (I don't have anything like Java's constant table), and has many features that I don't need. - my bytecode can run faster than Java's, mostly because I specifically designed it for fast execution. Also, many of the features that slow down Java are not present in the AmigaMUD programming language, so I could use simpler techniques. Did I Reach My Goals? At first, I was quite disappointed in the speedups I was getting with bytecode. For my first fractal terrain generator, it was difficult to reach even a 3 times speedup over the old interpreter. I had expected much more speedup, and could not explain the small gains. Then I thought about it differently, and thought that perhaps for that code, the original interpreter was faster than I had thought. I went to a native code version of the same program, and discovered that it only ran 6.5 times faster than the bytecode version! That ratio is almost unheard-of for any kind of interpretation, so I had no reason to be dissatisfied. Here is one test program that I used. It is a solver for the "Towers of Hanoi" problem, but without any printouts. Printouts would limit the speed of any version to system imposed speeds, so they have been left out. hanoi: proc, owner SysAdmin, useCount 3: proc hanoi(int fromPeg, toPeg, usingPeg, n)void: if n ~= 0 then hanoi(fromPeg, usingPeg, toPeg, n - 1); hanoi(usingPeg, toPeg, fromPeg, n - 1); fi; corp The timings I got for it are: n Old New Factor Native Factor ----------------------------------- 18 43 7 6.1 0.62 11.2 19 87 15 5.8 1.26 11.9 20 170 29 5.9 2.54 11.4 So, the bytecode interpreter runs about 6 times faster than the old interpreter, and about 12 times slower than native code, for this routine. The absolute times are for a 25 Mhz 68040. The actual fractal terrain generator has somewhat different numbers: 256 x 256 fractal terrain generation: Old New Factor Native Factor -------------------------------- 43 13 3.3 2.0 6.5 Here, the bytecode is 3.3 times faster than the old interpreter, and 6.5 times slower than native code. Why is there such a large difference in the ratios? I believe this is an artifact of the nature of the code, and the architecture of the 68000 family of CPUs, and that the ratios would be quite different for other CPUs. The bytecode machine has 3 address registers (PC, SP, FP, as mentioned earlier), and needs upto 4 or 5 other 32 bit values in some instructions. These fit nicely into the registers available in the 68000 family, and so the bytecode interpreter runs well. The fractal program has a lot of local variables, more than will fit into the registers, so it doesn't run very well (at least with the compiler I used) as native code. So, the slowdown from native to bytecode is not that much. On other CPU architectures, things are different. On the register-poor Intel x86 architecture, the compiler may not be able to keep all of the virtual machine's registers in real registers, and so the bytecode will not be as effective. On RISC CPU's, which typically have 32 registers, all of the local variables for the fractal routine will fit into real registers, and it will run correspondingly faster. In both of those cases, the slowdown from native code to bytecode will be larger than I observed for the 68000 family. Is More Speedup Possible? There are some things that can be done to squeeze a bit more speedup from the bytecode interpreter, but they are probably pretty minimal. One of the things that slows down the bytecode interpreter is the fact that it must fetch bytes of longer values (16 bit or 32 bit) from the instruction stream one byte at a time. This is because those values may not be aligned, and so the interpreter cannot use the 16 or 32 bit load instructions, which require aligned operands on the 68000 family. On CPU's which do not require such alignment, those instructions might be usable, so long as there are no byte-ordering problems (the byte- order for AmigaMUD bytecode is big-endian). Even with endian differences, optimization of the bytecode on first execution can be done, re-arranging the bytes to be in the required order for the native CPU. On a CPU that does not allow unaligned accesses, some of the operands will end up properly aligned, just by chance, and the system could detect those cases and replace the instructions with equivalent ones that just use the 16 or 32 bit fetch. This is possible since, once bytecode is present in the server's memory for a function, it is not moved. A more extreme possiblity is to change from bytecode to wordcode, where the instructions are 32 bit values, and everything ends up being properly aligned. This would greatly increase the size of the code, but might be worthwhile in terms of speedup. With 16 bit or 32 bit instructions, the interpreter would not need to extend the bytes before using them to index into the compiler-generated table of offsets for the case statement which is the core of the engine. Looking at the 68000 assembler code my compiler generated for the bytecode interpreter, there are a couple of instructions, per bytecode instruction executed, that could be removed. With larger opcodes (no longer just a byte), the opcodes could all be multiples of 2 (the likely size of the entries in the compiler-generated jump table), so that the shift generated by the compiler is not needed. Its not clear that a compiler could ever be convinced of this however, since it likely doesn't check to see that all possible indexes are multiples of 2, and likely will also insist on checking for all of the odd values. Since all bytecode is generated by the system, I could assume that only valid bytecodes are present, and not test the range of the bytecodes. Again, this is difficult to convince a compiler of. -- Chris Gray cg#ami-cg,GraySage.Edmonton.AB.CA </PRE> <!--X-Body-of-Message-End--> <!--X-MsgBody-End--> <!--X-Follow-Ups--> <HR> <!--X-Follow-Ups-End--> <!--X-References--> <!--X-References-End--> <!--X-BotPNI--> <UL> <LI>Prev by Date: <STRONG><A HREF="msg01136.html">[MUD-Dev] Re: Prescience Rules?</A></STRONG> </LI> <LI>Next by Date: <STRONG><A HREF="msg01138.html">[MUD-Dev] Re: Analysis and specification - the dirty words o</A></STRONG> </LI> <LI>Prev by thread: <STRONG><A HREF="msg01145.html">[MUD-Dev] Biting the hand that feeds you</A></STRONG> </LI> <LI>Next by thread: <STRONG><A HREF="msg01116.html">[MUD-Dev] MySQL scalability</A></STRONG> </LI> <LI>Index(es): <UL> <LI><A HREF="index.html#01137"><STRONG>Date</STRONG></A></LI> <LI><A HREF="thread.html#01137"><STRONG>Thread</STRONG></A></LI> </UL> </LI> </UL> <!--X-BotPNI-End--> <!--X-User-Footer--> <!--X-User-Footer-End--> <ul><li>Thread context: <BLOCKQUOTE><UL> <LI><STRONG>[MUD-Dev] Re: WIRED: Kilers have more fun</STRONG>, <EM>(continued)</EM> <ul compact> <ul compact> <LI><strong><A NAME="01258" HREF="msg01258.html">[MUD-Dev] Re: WIRED: Kilers have more fun</A></strong>, J C Lawrence <a href="mailto:claw#under,engr.sgi.com">claw#under,engr.sgi.com</a>, Tue 30 Jun 1998, 00:37 GMT </LI> </ul> <LI><strong><A NAME="01167" HREF="msg01167.html">[MUD-Dev] Re: WIRED: Kilers have more fun</A></strong>, Jon A. Lambert <a href="mailto:jlsysinc#ix,netcom.com">jlsysinc#ix,netcom.com</a>, Mon 22 Jun 1998, 23:11 GMT </LI> </ul> </LI> <LI><strong><A NAME="01157" HREF="msg01157.html">[MUD-Dev] RI Adventure Contest (fwd)</A></strong>, Holly Sommer <a href="mailto:hsommer#micro,ti.com">hsommer#micro,ti.com</a>, Mon 22 Jun 1998, 13:32 GMT <LI><strong><A NAME="01145" HREF="msg01145.html">[MUD-Dev] Biting the hand that feeds you</A></strong>, J C Lawrence <a href="mailto:claw#under,engr.sgi.com">claw#under,engr.sgi.com</a>, Fri 19 Jun 1998, 23:10 GMT <LI><strong><A NAME="01137" HREF="msg01137.html">[MUD-Dev] bytecode results</A></strong>, Chris Gray <a href="mailto:cg#ami-cg,GraySage.Edmonton.AB.CA">cg#ami-cg,GraySage.Edmonton.AB.CA</a>, Fri 19 Jun 1998, 04:17 GMT <LI><strong><A NAME="01116" HREF="msg01116.html">[MUD-Dev] MySQL scalability</A></strong>, J C Lawrence <a href="mailto:claw#under,engr.sgi.com">claw#under,engr.sgi.com</a>, Wed 17 Jun 1998, 18:51 GMT <LI><strong><A NAME="01113" HREF="msg01113.html">[MUD-Dev] Prescience Rules? [Previously submitted under wrong thread :( ]</A></strong>, Larry Homer <a href="mailto:afn40452#afn,org">afn40452#afn,org</a>, Wed 17 Jun 1998, 17:56 GMT <UL> <LI><strong><A NAME="01118" HREF="msg01118.html">[MUD-Dev] Re: Prescience Rules? [Previously submitted under wrong thread :( ]</A></strong>, Dan Shiovitz <a href="mailto:dbs#cs,wisc.edu">dbs#cs,wisc.edu</a>, Wed 17 Jun 1998, 19:45 GMT <UL> <LI><strong><A NAME="01120" HREF="msg01120.html">[MUD-Dev] Re: Prescience Rules? [Previously submitted under wrong thread :( ]</A></strong>, Richard Woolcock <a href="mailto:KaVir#dial,pipex.com">KaVir#dial,pipex.com</a>, Wed 17 Jun 1998, 23:41 GMT </LI> </UL> </LI> </UL> </LI> </UL></BLOCKQUOTE> </ul> <hr> <center> [ <a href="../">Other Periods</a> | <a href="../../">Other mailing lists</a> | <a href="/search.php3">Search</a> ] </center> <hr> </body> </html>