15 Mar, 2015, Davenge wrote in the 1st comment:
Votes: 0
Just curious, given the increase in general bandwidth access these days.
15 Mar, 2015, Kardon wrote in the 2nd comment:
Votes: 0
No, not really. If anyone wants to argue that it is still relevant, then I would point them to Minecraft's network code and how horrible it is, yet it still works smoothly enough. Text-based games aren't going to saturate your bandwidth at all, compression or not.
16 Mar, 2015, quixadhal wrote in the 3rd comment:
Votes: 0
Assume your average screen size is 132x25. That's 3300 characters. Round up to 4K for ANSI sequences and whatever. That's a good upper bound on the amount of bandwidth a text game will use with full text scrolling at you. Assuming you had that much text flying at you every second, you'd be using 32Kbit of bandwidth per second (4K screen every second * 8 bits).

So, let's assume you run a wildly popular (by today's standards) MUD like BatMUD or Discworld. You have 100+ players online all the time, and we'll pretend that half of them are not AFK and actually playing the game. 50 players using 32Kbit would total up to about 1.6Mbit of bandwidth usage.

Now, I don't know what anyone else's internet connection is like, but my own connection costs me $40/month and has a cap of 3Mbit for the upstream side. So, were I to host a wildly popular MUD on my home machine, it would cost me approximately $20/month to do so.

What level of compression does MCCP offer? I assume it uses gzip, which tends to compress plain text at around 80%. So, while this doesn't matter in the slightest bit for any individual player, it would cut my bandwidth use (at the server) from 1.6Mbit (or 50% saturation) to 320Kbit (or 10% saturation), which would lower the effective cost from $20/month to $4/month.
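The arithmetic above can be sketched in a few lines. This is only a back-of-the-envelope estimate: the constant names are made up for illustration, "4K" is taken as 4000 bytes, and the 80% savings figure is the assumption stated in the post, not a measured MCCP ratio.

```python
# Rough server-side bandwidth estimate using the figures from the post.
# All names here are illustrative; the 80% savings is an assumption.
BYTES_PER_SCREEN = 4000          # 132x25 screen, rounded up to "4K"
ACTIVE_PLAYERS = 50              # half of 100 players actually playing
UPSTREAM_CAP_BITS = 3_000_000    # 3Mbit upstream cap
MONTHLY_COST = 40.0              # dollars per month for the connection
ASSUMED_SAVINGS = 0.80           # assumed gzip-style savings on plain text


def estimate(compressed: bool):
    """Return (bits per second, share of upstream cap, share of monthly cost)."""
    bits_per_second = BYTES_PER_SCREEN * 8 * ACTIVE_PLAYERS
    if compressed:
        bits_per_second *= 1 - ASSUMED_SAVINGS
    share = bits_per_second / UPSTREAM_CAP_BITS
    return bits_per_second, share, share * MONTHLY_COST


for label, flag in (("uncompressed", False), ("compressed", True)):
    bits, share, cost = estimate(flag)
    print(f"{label}: {bits / 1e6:.2f} Mbit/s, {share:.0%} of cap, ${cost:.2f}/month")
```

With decimal units this lands slightly above the rounded figures in the post (roughly 53% and 11% of the cap rather than 50% and 10%), which is the same ballpark.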

Note that these are somewhat optimistically high assumptions. *IF* your MUD is popular *AND* it has enough combat spam or other output to generate that much text *AND* you have a typical connection where the upstream limit is pretty low, MCCP can save you a pretty good amount.

Would it be worth developing such a thing from scratch? No. Would it be worth spending a few minutes to download already-existing code and patching it into your game server or client? Maybe.
16 Mar, 2015, Idealiad wrote in the 4th comment:
Votes: 0
It used to be relevant for people playing on their mobiles. However data plans (and wireless access point availability) are getting so much better these days I'm not sure that still holds.
16 Mar, 2015, plamzi wrote in the 5th comment:
Votes: 0
I would argue that it's still relevant, even for MUDs with less than 100 concurrent players (of which I can think of exactly one). More relevant if you have players on mobile, and more so if you're sending out a good amount of out-of-band info.

It's not just about client or server bandwidth, both of which are bound to become less and less problematic. It's more about minimizing the effects of network latency and tapping into the client to do some lifting for you. In other words, some of the same reasons video compression is often not only relevant, but crucial. Not everyone lives next door to your server, and even folks with no bandwidth limits will have wildly variable throughput during the course of a normal day.

On the other hand, I don't think it's a matter of "a few minutes" to patch in MCCP. If you're looking at code not specifically written for your codebase, it will take some tinkering to get it working right. It's not rocket science (the science was taken care of by the folks who wrote the algorithms), but it's also nothing like turning on GZIP on your webserver.
16 Mar, 2015, KaVir wrote in the 6th comment:
Votes: 0
From a post I made back in August 2012:

Quote
3GB? Lucky you, I'm limited to 200MB!

On another thread I mentioned that a random active player on my mud downloaded 167794 bytes (compressed from 1134333) over 2 hours, 15 minutes and 7 seconds, which works out at an average of around 20.7 bytes per second (compressed from 139.9 per second).

At that rate, with compression, it would take me 2814.6 hours (117.3 days) to hit my bandwidth limit. Without compression, it would take 416.3 hours (17.3 days).
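For anyone who wants to check those numbers, here is a small sketch of the calculation. It assumes the 200MB limit means 200 * 1024 * 1024 bytes (that interpretation reproduces the quoted figures); the function name is made up for illustration.

```python
# Reproduce the per-second rates and time-to-limit figures quoted above.
COMPRESSED_BYTES = 167_794
RAW_BYTES = 1_134_333
SESSION_SECONDS = 2 * 3600 + 15 * 60 + 7      # 2 hours, 15 minutes, 7 seconds
MONTHLY_LIMIT_BYTES = 200 * 1024 * 1024       # assumption: binary megabytes


def hours_to_limit(bytes_sent: int) -> float:
    """Hours of continuous play needed to exhaust the monthly limit."""
    bytes_per_second = bytes_sent / SESSION_SECONDS
    return MONTHLY_LIMIT_BYTES / bytes_per_second / 3600


for label, sent in (("compressed", COMPRESSED_BYTES), ("uncompressed", RAW_BYTES)):
    hours = hours_to_limit(sent)
    print(f"{label}: {sent / SESSION_SECONDS:.1f} bytes/s, "
          f"{hours:.1f} hours ({hours / 24:.1f} days)")
```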

So if I only used my phone for mudding, it seems unlikely I'd hit my limit, even without compression. But that's not all I use my phone for, and the convenience of a mobile device means I can leave myself logged on in a semi-idle state for extended periods of time if I wish. If the mud is eating a large percentage of my available bandwidth, then it becomes considerably less convenient.

Of course that was 2 years ago - my phone company now offers me a "generous" 1 GB monthly limit. Without compression, the MUD could still potentially eat a noticeable chunk of my bandwidth, but it's becoming less of an issue.
17 Mar, 2015, Runter wrote in the 7th comment:
Votes: 0
The answer is it's not important for a vast majority of owners.

I wouldn't waste 2 minutes thinking about implementing this until it's a real problem for me. If it's not already for you then I suspect your bandwidth grant will grow at a rate that makes it unlikely for it ever to become a real problem.

Unless you just want to tinker. Which is great.
17 Mar, 2015, SlySven wrote in the 8th comment:
Votes: 0
Ah, does packet fragmentation have any connection to this? I mean, how much overhead does the compression add to the individual "lumps" of data that get sent between the server and a connected client? If a piece of information has to pass from one to the other, will MCCP help by keeping the data in one packet rather than letting it be split between two?

I or some other poor sod, er, intrepid coder is going to have to go over Mudlet's Telnet code and rebuild it to handle UTF-8 streams wrapped up in the Telnet protocols - this is made interesting because unlike plain ASCII the data that makes up the text stream is NOT single bytes and if there isn't enough space in a packet it is quite possible for the data that makes up a single grapheme (sort of: a visible "character") to get divided between packets, so extra stuff to track the current "state" will be required. That being the case, if compression reduces the chance of such fragmentation it lessens the chance that the extra code/state-machine that parses such data-streams is going to be exercised so much!

Just my amateur coder's two pence worth.
17 Mar, 2015, plamzi wrote in the 9th comment:
Votes: 0
SlySven said:
if compression reduces the chance of such fragmentation it lessens the chance that the extra code/state-machine that parses such data-streams is going to be exercised so much!


Without any in-depth TCP/IP knowledge, I would say that logically speaking compression should reduce the chance of fragmentation, but at a rate lesser than the compression rate. That's because it seems to me most data bursts in a MUD are likely to fit in one packet. I have seen fragmentation in MUD traffic, however, because in reality congestion on both the server and client side is quite possible at any given moment.

If you're worried about CPU overhead on the client side (really, you shouldn't be), I would guess the savings from the two-byte split checks are bound to be a lot smaller than the CPU cycles spent on decompression. So you should not be adding compression solely for that purpose. But since your client already has MCCP support, then yes, it should help at least a tiny bit in that respect.

And when I say tiny, it's probably going to be very tiny. As in, almost not worth talking about, except maybe in a MUD dev forum.
18 Mar, 2015, SlySven wrote in the 10th comment:
Votes: 0
plamzi said:
…I would guess the savings from the two-byte split checks are bound to be a lot smaller…
Um, some UTF-8 code points take up to 4 bytes. And when you get into normalization after decomposition (as a MUD client will have to when comparing strings entered from a keyboard by a user against what the MUD server sends, or vice versa), there is a Unicode-defined "Safe Text" maximum of 30 consecutive code points in up to around 128 bytes that should be examined before you can be certain that you have a complete representation of a portion of a text stream (see section 13) here.
18 Mar, 2015, plamzi wrote in the 11th comment:
Votes: 0
SlySven said:
plamzi said:
…I would guess the savings from the two-byte split checks are bound to be a lot smaller…
Um, some UTF-8 code points take up to 4 bytes. And when you get into normalization after decomposition (as a MUD client will have to when comparing strings entered from a keyboard by a user against what the MUD server sends, or vice versa), there is a Unicode-defined "Safe Text" maximum of 30 consecutive code points in up to around 128 bytes that should be examined before you can be certain that you have a complete representation of a portion of a text stream (see section 13) here.


We're veering here, but I stand by my original assessment. Think about how often an event like a "not fragmented" frame will (not) happen vs. an event like decompressing every single incoming byte. Remember that once you enable UTF-8 support, you will anyway have to buffer at the end of each incoming data burst.

Speaking of which, your client supports a number of protocols with their own sequences, so isn't it smart-buffering already and running a number of checks for completeness, starting from the most basic like ANSI control codes? And for stuff like MSDP and GMCP, aren't you already having to look ahead any number of characters to see if the closing sequence has already arrived?
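To make the "look ahead for the closing sequence" idea concrete, here is a minimal sketch of pulling complete Telnet subnegotiations (the framing GMCP and MSDP use) out of a receive buffer and leaving anything unfinished for the next read. The function name is made up, and the sketch deliberately ignores escaped IAC IAC bytes inside a payload, which a real client has to handle.

```python
# Telnet command bytes and the GMCP option number.
IAC, SB, SE = 255, 250, 240
GMCP = 201


def extract_subnegotiations(buffer: bytes):
    """Return (complete_payloads, remainder).

    A payload is everything between IAC SB and IAC SE, starting with the
    option byte. Plain text and any unterminated subnegotiation stay in
    the remainder so the caller can retry after the next read.
    Simplification: escaped IAC IAC pairs inside payloads are not handled.
    """
    opener = bytes([IAC, SB])
    closer = bytes([IAC, SE])
    payloads = []
    while True:
        start = buffer.find(opener)
        if start == -1:
            break
        end = buffer.find(closer, start + 2)
        if end == -1:
            break  # closing sequence has not arrived yet
        payloads.append(buffer[start + 2:end])
        buffer = buffer[:start] + buffer[end + 2:]
    return payloads, buffer


data = (b"hello" + bytes([IAC, SB, GMCP]) + b"Core.Hello {}" + bytes([IAC, SE])
        + b"world" + bytes([IAC, SB, GMCP]) + b"Char.Vit")
complete, pending = extract_subnegotiations(data)
print(complete)   # one finished GMCP message
print(pending)    # plain text plus the unfinished subnegotiation
```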

If you're thinking of allowing UTF-8 only for connections with MCCP on, I feel that would just be a case of shooting an arrow through your own knee. I'm with those who believe the coding world has seen more damage from premature optimization than from lack thereof. You're writing in what is today a low-level language, and even smartphones nowadays breeze through stuff like that.
20 Mar, 2015, Tijer wrote in the 12th comment:
Votes: 0
Mudlet also parses MCCP incorrectly, I'd like to point out!! On EVERY single MUD I have played using Mudlet, I've had to turn off compression…
20 Mar, 2015, Scandum wrote in the 13th comment:
Votes: 0
plamzi said:
Without any in-depth TCP/IP knowledge, I would say that logically speaking compression should reduce the chance of fragmentation, but at a rate lesser than the compression rate. That's because it seems to me most data bursts in a MUD are likely to fit in one packet. I have seen fragmentation in MUD traffic, however, because in reality congestion on both the server and client side is quite possible at any given moment.

MCCP (zlib) handles broken packets, so it effectively eliminates unexpected packet fragmentation. Smaller packets also travel faster.
20 Mar, 2015, Tyche wrote in the 14th comment:
Votes: 0
SlySven said:
Um, some UTF-8 code points take up to 4 bytes. And when you get into normalization after decomposition (as a MUD client will have to when comparing strings entered from a keyboard by a user against what the MUD server sends, or vice versa), there is a Unicode-defined "Safe Text" maximum of 30 consecutive code points in up to around 128 bytes that should be examined before you can be certain that you have a complete representation of a portion of a text stream (see section 13) here.

I haven't found a regex library that compares characters for equivalence in that manner, so I ignore it.
Besides we're not going to send it, are we? And players are going to attempt to match what is sent.
21 Mar, 2015, dracmas wrote in the 15th comment:
Votes: 0
I didn't want to add a new topic, but only had a quick question. Added a snippet for tbamud to try out MCCP, but can't tell if anything is different when trying it out.
How do you tell the difference when you have MCCP on compared to when it's not there at all?
21 Mar, 2015, plamzi wrote in the 16th comment:
Votes: 0
Scandum said:
MCCP (zlib) handles broken packets, so it effectively eliminates unexpected packet fragmentation.


Care to elaborate on this part?

Having patched in a pretty standard zlib implementation into a Dikurivative, I did not see any magic in there that would be able to do that. Just like socket polling itself, what I get out of the zlib buffer is a continuous stream, with no divine knowledge of whether a bit of info I'm waiting for has arrived or not.

Keep in mind that what SlySven is concerned about is not low-level TCP packet fragmentation, but specifically whether all bytes in a UTF-8 character sequence have arrived. Other than compressing the data into smaller packets that are less likely to get fragmented or delayed at the TCP packet level, MCCP will bring no magic solution to his problem.

The following thread confirms what I've always thought was the case. At the application level, TCP data is a stream (and zlib is a stream on top of that stream):
http://stackoverflow.com/questions/75676...
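A quick way to see the "stream on top of a stream" behaviour is to feed a zlib stream to a decompressor in arbitrarily sized pieces. This is only an illustration with Python's standard zlib module, not MCCP negotiation code; the sample text, chunk sizes and function name are made up, and the sync flush stands in for whatever flushing a server actually does.

```python
import zlib

# Some repetitive MUD-like output with a few multi-byte UTF-8 characters.
text = "The troll hits you. Smörgåsbord!\r\n".encode("utf-8") * 20

# One compressed stream, flushed so that everything written so far can be
# decompressed on the other side.
compressor = zlib.compressobj()
wire = compressor.compress(text) + compressor.flush(zlib.Z_SYNC_FLUSH)


def decompress_in_chunks(data: bytes, chunk_size: int):
    """Feed the compressed bytes to zlib chunk_size bytes at a time."""
    decompressor = zlib.decompressobj()
    pieces = []
    for offset in range(0, len(data), chunk_size):
        pieces.append(decompressor.decompress(data[offset:offset + chunk_size]))
    return pieces


for size in (1, 7, 64):
    pieces = decompress_in_chunks(wire, size)
    # However the wire data was sliced, the output is the same byte stream.
    print(size, len(pieces), b"".join(pieces) == text)
```

The decompressor hands back bytes as they become available and knows nothing about UTF-8, so a returned piece can still end in the middle of a character. The application has to buffer either way.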
22 Mar, 2015, SlySven wrote in the 17th comment:
Votes: 0
plamzi said:
Keep in mind that what SlySven is concerned about is not low-level TCP packet fragmentation, but specifically whether all bytes in a UTF-8 character sequence have arrived. Other than compressing the data into smaller packets that are less likely to get fragmented or delayed at the TCP packet level, MCCP will bring no magic solution to his problem.


Yeah, plamzi, that's exactly what I was nattering about, and I don't know to what extent MCCP is a technological solution to that {or magic - I recall what Arthur C. Clarke said on the difference…!} I guess everyone's mileage varies.

Tyche said:
I haven't found a regex library that compares characters for equivalence in that manner, so I ignore it.
Besides we're not going to send it, are we? And players are going to attempt to match what is sent.
Ah, but how will they know what is sent? To take a trivially small example, is "å" the same as "å"? And for a really evil twisted example, how about these four, some of which use "Alphabetic Presentation Forms" and, depending on your browser, fonts and phases of the moon, may or may not appear the same: "office", "oﬃce", "oﬀice", "ofﬁce". While the latter case is a bit contrived, the former is likely to be down to which country the MUD or player is in and their keyboard/OS!

For the record, all of those cases use different byte sequences, but whether they should be treated as equal or "matched" by a regexp system is more complicated! In detail:
"å"
u+00E5 LATIN SMALL LETTER A WITH RING ABOVE (unicode code points)
0xC3 0xA5 (UTF-8 bytes)

"å"
u+0061 LATIN SMALL LETTER A; u+030A COMBINING RING ABOVE
0x61 0xCC 0x8A

"office"
u+006F LATIN SMALL LETTER O; u+0066 LATIN SMALL LETTER F; u+0066 LATIN SMALL LETTER F; u+0069 LATIN SMALL LETTER I; u+0063 LATIN SMALL LETTER C; u+0065 LATIN SMALL LETTER E
0x6F 0x66 0x66 0x69 0x63 0x65

"oﬃce"
u+006F LATIN SMALL LETTER O; u+FB03 LATIN SMALL LIGATURE FFI; u+0063 LATIN SMALL LETTER C; u+0065 LATIN SMALL LETTER E
0x6F 0xEF 0xAC 0x83 0x63 0x65

"oﬀice"
u+006F LATIN SMALL LETTER O; u+FB00 LATIN SMALL LIGATURE FF; u+0069 LATIN SMALL LETTER I; u+0063 LATIN SMALL LETTER C; u+0065 LATIN SMALL LETTER E
0x6F 0xEF 0xAC 0x80 0x69 0x63 0x65

"ofﬁce"
u+006F LATIN SMALL LETTER O; u+0066 LATIN SMALL LETTER F; u+FB01 LATIN SMALL LIGATURE FI; u+0063 LATIN SMALL LETTER C; u+0065 LATIN SMALL LETTER E
0x6F 0x66 0xEF 0xAC 0x81 0x63 0x65
Admittedly I am digressing a bit there, but the Mud Client program must handle such sequences and if they are split between packets the coding must handle it - it is just that if the packets are small enough this may not be seen and I guess MCCP might stave off such fragmentation until packets get big enough to bite the bad coder (or the users of the code) in the backside…!
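The equivalences listed above can be checked with Python's standard unicodedata module. This is just a sketch of what Unicode normalization does to those exact sequences; the variable names are made up, and Mudlet itself would use whatever its own toolkit provides.

```python
import unicodedata

precomposed = "\u00e5"     # LATIN SMALL LETTER A WITH RING ABOVE
decomposed = "a\u030a"     # LATIN SMALL LETTER A + COMBINING RING ABOVE

# Different code points and different UTF-8 bytes...
print(precomposed == decomposed)
print(precomposed.encode("utf-8").hex(" "), "|", decomposed.encode("utf-8").hex(" "))

# ...but canonically equivalent: NFC composes, NFD decomposes.
print(unicodedata.normalize("NFC", decomposed) == precomposed)
print(unicodedata.normalize("NFD", precomposed) == decomposed)

# The ligatures are only *compatibility* equivalents, so the canonical
# forms (NFC/NFD) leave them alone and NFKC/NFKD are needed to expand them.
variants = ["office", "o\ufb03ce", "o\ufb00ice", "of\ufb01ce"]
print([unicodedata.normalize("NFC", v) == "office" for v in variants])
print([unicodedata.normalize("NFKC", v) == "office" for v in variants])
```

Normalizing both the user's input and the server's text to the same form before comparing is the usual way to make the first pair match. Whether the ligatures should also match is a policy choice.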
22 Mar, 2015, alteraeon wrote in the 18th comment:
Votes: 0
The fragmentation issue you are talking about is a non-problem, and MCCP isn't a fix for it. Your socket stack should treat the data as a stream, where bytes may or may not arrive in any given processing pass. If you're not handling it this way, you're doing it wrong.
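As a concrete illustration of treating the data as a stream, here is a minimal sketch using Python's incremental UTF-8 decoder, which holds back an incomplete trailing sequence until the rest arrives. The function name and the sample chunks are made up; a C++ client would need an equivalent small state machine.

```python
import codecs


def decode_stream(chunks):
    """Yield decoded text for each chunk, carrying partial characters over."""
    decoder = codecs.getincrementaldecoder("utf-8")(errors="replace")
    for chunk in chunks:
        text = decoder.decode(chunk)
        if text:
            yield text


# "på!" arrives split in the middle of the two-byte sequence for "å".
reads = [b"p\xc3", b"\xa5!"]
print(list(decode_stream(reads)))
```

Note that this only keeps code points whole. A base letter and a following combining mark can still arrive in different reads, so anything working at the grapheme level needs its own buffering on top.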

Regarding unicode handling, our custom client has a several thousand character 'stomp table' which smacks all the stupid glyphs from the input window down to something appropriate, or nukes them if they're too weird or unknown. Doing some kind of conversion down to 7 bit ascii was the only way to get consistency and allow copy/paste from browser windows to work properly. For something like general client regex matching, I would recommend a similar approach - sanitize the incoming utf-8 source and search strings and convert them to a sane and proper ascii, then run an ordinary regex against the conversions. If your users specifically need a utf-8 grade regex for some reason, perhaps provide it as an obscure feature.
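A very small sketch of the "stomp table" idea follows. It is not Alter Aeon's actual table: the override entries, the function name and the choice to lean on NFKD decomposition for everything else are all assumptions made for illustration.

```python
import unicodedata

# Hand-picked replacements for characters that have no useful decomposition.
# A real table would have thousands of entries; these are examples only.
STOMP_OVERRIDES = {
    "\u2018": "'", "\u2019": "'",    # curly single quotes
    "\u201c": '"', "\u201d": '"',    # curly double quotes
    "\u00f8": "o",                   # o with stroke
}


def stomp(text: str, unknown: str = "") -> str:
    """Flatten text to 7-bit ASCII, dropping (or replacing) what is left over."""
    out = []
    for ch in text:
        if ch in STOMP_OVERRIDES:
            out.append(STOMP_OVERRIDES[ch])
            continue
        for piece in unicodedata.normalize("NFKD", ch):
            if unicodedata.combining(piece):
                continue                      # strip accents and other marks
            out.append(piece if ord(piece) < 128 else unknown)
    return "".join(out)


print(stomp("\u201cSm\u00f6rg\u00e5sbord\u201d at the o\ufb03ce"))
```

Running the same function over both the incoming text and the user's search string means an ordinary ASCII regex can then be used on the results.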

Alter Aeon MUD
http://www.alteraeon.com
23 Mar, 2015, quixadhal wrote in the 19th comment:
Votes: 0
Or, you could code in a modern language with unicode strings as the default, rather than a hacked-addon like C's character arrays. :)

I believe a regexp in Python will properly match Unicode, with the understanding that you have to use the Unicode-aware character classes. I.e., if you're using unicode strings, \w will match a "glyph" that's considered a character, but [a-zA-Z] will only match the ASCII alphabet.
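A short check of that behaviour with Python 3's standard re module, where str patterns are Unicode-aware by default. The sample word is made up for illustration.

```python
import re

word = "sm\u00f6rg\u00e5sbord"   # "smörgåsbord"

# \w is Unicode-aware for str patterns, so the whole word matches.
print(bool(re.fullmatch(r"\w+", word)))

# An explicit ASCII range stops at every non-ASCII letter.
print(bool(re.fullmatch(r"[a-zA-Z]+", word)))
print(re.findall(r"[a-zA-Z]+", word))

# The re.ASCII flag restores the ASCII-only meaning of \w.
print(bool(re.fullmatch(r"\w+", word, flags=re.ASCII)))
```

This only covers character classes. It does not make the precomposed and decomposed forms of a letter compare equal, so the equivalence question raised earlier in the thread still needs normalization.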
23 Mar, 2015, Pymeus wrote in the 20th comment:
Votes: 0
Most "serious" languages that can handle Unicode strings at all should be able to do Unicode regex. At the very least there's almost always a PCRE library, even in C, which can handle UTF-8.

I think the main issue is sorting out character equivalencies. I like alteraeon's "stomp table", although creating it would be fairly tedious.