09 Jun, 2009, David Haley wrote in the 141st comment:
Votes: 0
That's my best guess as well, but I don't like analyzing algorithms with just guesses, especially when there's contention in the air…
Anyhow my preference would be to use more standard terminology (i.e. a map keyed by coordinate pair implemented as a hash table) so that we can get straight to the more interesting stuff.
09 Jun, 2009, Runter wrote in the 142nd comment:
Votes: 0
David Haley said:
That's my best guess as well, but I don't like analyzing algorithms with just guesses, especially when there's contention in the air…
Anyhow my preference would be to use more standard terminology (i.e. a map keyed by coordinate pair implemented as a hash table) so that we can get straight to the more interesting stuff.


Agreed. :P
09 Jun, 2009, quixadhal wrote in the 143rd comment:
Votes: 0
There's also another way to do a wilderness that doesn't involve any stored data… generate it using a method which (given the same seed value) will always yield the same results. There are several method out there, Perlin noise comes to mind, to make a reasonable psuedo-random wilderness that can be regenerated at a point. If you want to display a map, you can call your generation function on a range and only create the single room you are in (but show X rooms around if you like).

The tradeoff is CPU vs. Memory. In theory, you could have an infinite sized world this way. It wouldn't be a very interesting world, unless you also provide a varying ecology and npc spawns that change as well…. Walking a million rooms and seeing totally different terrain and npc types that don't occur around the civilized areas might be fun.
09 Jun, 2009, David Haley wrote in the 144th comment:
Votes: 0
Quote
If you want to display a map, you can call your generation function on a range and only create the single room you are in (but show X rooms around if you like).

No, for this to work, you would have to generate every room always in the exact same order – in other words, to guarantee that the random generator produces the same values, you need to make sure that not only do you start with the same seed, but also you generate the rooms in a deterministic order. So, just running on some range would violate that, and you would not necessarily get the same things as if you started from a different range.

It's easy to see this if you consider a one-dimensional space with, say, 20 rooms. If you use seed X and generate rooms 5-15, and then use seed X and generate rooms 6-16, you can easily see that room 7 will be (or more strictly speaking, can be) different in the two generations.
09 Jun, 2009, Runter wrote in the 145th comment:
Votes: 0
quixadhal said:
There's also another way to do a wilderness that doesn't involve any stored data… generate it using a method which (given the same seed value) will always yield the same results. There are several method out there, Perlin noise comes to mind, to make a reasonable psuedo-random wilderness that can be regenerated at a point. If you want to display a map, you can call your generation function on a range and only create the single room you are in (but show X rooms around if you like).

The tradeoff is CPU vs. Memory. In theory, you could have an infinite sized world this way. It wouldn't be a very interesting world, unless you also provide a varying ecology and npc spawns that change as well…. Walking a million rooms and seeing totally different terrain and npc types that don't occur around the civilized areas might be fun.

An interesting idea really.
09 Jun, 2009, Scandum wrote in the 146th comment:
Votes: 0
Runter said:
I'm guessing it's a hash table with 2 keys to the bucket. Should be perfect distribution.

Pretty much. I guess 'pointer grid' is the best way to describe it since it implicates the ability to point to a default room (behaving like a hash table) or a custom room (behaving like a classic grid). With rooms becoming tiles if you want a 4M+ wilderness.

The way I see it a location would be a coordinate, with a tile having multiple locations, and a traditional Diku room having 1 location. A tile room would be a tile that also functions as a container.

quixadhal said:
There's also another way to do a wilderness that doesn't involve any stored data… generate it using a method which (given the same seed value) will always yield the same results.

This is common for generating dungeons, using the player vnum as a random seed so the player can exit and re-enter and have the same dungeon layout, though everything it killed will have been reset.

For a wilderness you could use a compression algorithm, so if you have a 10.000x10.000 grid you could have an array of 10.000 indexes with index 40 (for example) containing from left to right: 2000 ocean rooms, 2 beach rooms, 6 plain rooms, 44 forest rooms, 96 plain rooms, 2 beach rooms, 7866 ocean rooms. Then it shouldn't be too cpu intensive to find what sector is located at 7832,40. Most wildernesses I've seen compress fairly well in this fashion.
09 Jun, 2009, Runter wrote in the 147th comment:
Votes: 0
Scandum said:
Runter said:
I'm guessing it's a hash table with 2 keys to the bucket. Should be perfect distribution.

Pretty much. I guess 'pointer grid' is the best way to describe it since it implicates the ability to point to a default room (behaving like a hash table) or a custom room (behaving like a classic grid). With rooms becoming tiles if you want a 4M+ wilderness.


Or reference grid. :P
09 Jun, 2009, David Haley wrote in the 148th comment:
Votes: 0
Frankly and truly no offense intended, I think that "pointer grid" or, in fact, anything with the word "grid" in it is actually a very misleading way of describing it, since you don't actually have a literal grid of pointers. If the word "grid" must absolutely be used, the word "sparse" should be put in front of it. The bare term "pointer grid" is extremely easy to misinterpret, since the literal meaning is a grid (i.e. 2d array) of pointers.
Hash tables don't have the behavior of pointing to some default thing. A hash table either maps a key to an element or doesn't. Saying that a hash table defaults to something is not standard behavior.
09 Jun, 2009, Scandum wrote in the 149th comment:
Votes: 0
David Haley said:
Hash tables don't have the behavior of pointing to some default thing. A hash table either maps a key to an element or doesn't. Saying that a hash table defaults to something is not standard behavior.

Maybe dummy/hashed room/tile is a better term?

David Haley said:
Frankly and truly no offense intended, I think that "pointer grid" or, in fact, anything with the word "grid" in it is actually a very misleading way of describing it, since you don't actually have a literal grid of pointers

In what I describe you do. You have a 2d array composed of pointers that point to a dummy room, when a player enters - the dummy room is replaced with a real room. In a room based implementation you'd have 1 dummy room for each sector, so if you have 20 sectors and a 1000x1000 grid you'd have 1.000.000 pointers pointing to 20 different dummy rooms. This is typical behavior for a hash table, though not used in the way that a hash table is typically used.

Technically the player could move into the actual dummy room, this would mean that the player would be in thousands of locations at once, hence the need to duplicate the dummy room into a real room to make for a unique location.

A grid implementation also says nothing about the underlying data type, it's well possible to implement a grid as a 1 dimensional array. So I think 'pointer grid' is correct, and 'hashed pointer grid' would imply that a form of hashing is used to conserve memory, with 'hashed room pointer grid', and 'hashed tile pointer grid' and 'hashed tile-room pointer grid' being even more specific. The classic wilderness implementation could be called an 'index grid' or 'sector index grid' with KaVir's implementation being a 'tile index grid' or more specific a 'sector-tile index grid'.

:lol:
09 Jun, 2009, David Haley wrote in the 150th comment:
Votes: 0
Quote
In what I describe you do. You have a 2d array composed of pointers that point to a dummy room, when a player enters - the dummy room is replaced with a real room. In a room based implementation you'd have 1 dummy room for each sector, so if you have 20 sectors and a 1000x1000 grid you'd have 1.000.000 pointers pointing to 20 different dummy rooms. This is typical behavior for a hash table, though not used in the way that a hash table is typically used.

I guess your hash tables work extremely differently from any other hash table I've seen before…

Anyhow, if you don't think your terminology is confusing, that's fine, although you said earlier that your "own terminology is far from solid" so I'm not sure what you meant anymore. I'm just trying to point out why I think there was severe miscommunication here: do with it what you will.
09 Jun, 2009, Runter wrote in the 151st comment:
Votes: 0
Quote
so if you have 20 sectors and a 1000x1000 grid you'd have 1.000.000 pointers pointing to 20 different dummy rooms.


So am I understanding correctly that if we wanted a world of 10000x10000 you'd have 100,000,000 pointers?


I'm not trying to flame or anything. I'm just kinda confused. :P
09 Jun, 2009, David Haley wrote in the 152nd comment:
Votes: 0
The real take-away here is that for this kind of thing, if you want to get truly big, you need to start thinking about sparse representations, not something that instantiates a bunch of stuff, unless what you're instantiating is truly tiny. Basically, you only need to actually represent "interesting" things, where the definition of "interesting" depends (of course) on what exactly you're doing.
09 Jun, 2009, KaVir wrote in the 153rd comment:
Votes: 0
David Haley said:
Several people have mentioned terminology confusion, so maybe there should be a collective effort to stop making assumptions about things and be more clear.


Okay, let me try and explain this more thoroughly. The default world is an array of 512x512 tile IDs. A tile ID is a number in the range 0-255, and is used as an index in a lookup table like this:

#define MAX_TILE 256

const unsigned char WorldTileTable[MAX_TILE][100] =
{
{
/* Number: 0/00
* Description: Solid sea water
* Quarters1: SS
* Quarters2: SS
* River: NONE
* No: 0
*/
0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0
},

(snip)

{
/* Number: 10/0a
* Description: River Through Plain
* Quarters1: PP
* Quarters2: PP
* River: E2W
* No: 103
*/
10, 10, 10, 10, 10, 10, 10, 10, 10, 10,
10, 10, 10, 10, 10, 10, 10, 10, 10, 10,
10, 10, 10, 10, 10, 10, 28, 28, 10, 10,
28, 28, 10, 10, 10, 28, 9, 6, 28, 28,
6, 6, 28, 28, 28, 9, 6, 6, 6, 6,
6, 6, 6, 6, 6, 6, 28, 28, 8, 6,
28, 28, 8, 6, 6, 28, 10, 10, 28, 28,
10, 10, 28, 28, 28, 10, 10, 10, 10, 10,
10, 10, 10, 10, 10, 10, 10, 10, 10, 10,
10, 10, 10, 10, 10, 10, 10, 10, 10, 10
},

(and so on)


Therefore the default world uses 256K of RAM, and the tile table uses 25K of RAM.

As you can see, the tiles in the above table each contain 10x10 terrain types. Therefore, by extension, the default world contains the data for 5120x5120 terrain types.

The terrain type represents the terrain at normal ground level, but it can also be used to determine the terrain at higher and lower points - for example a "west-flowing river" terrain type always has 4 levels of "west-flowing underwater current" below it, followed by 15 levels of "rock". And it always has 20 levels of "air" above it. The same is true of every terrain type - each contains the information for 40 height levels of terrain.

Thus the default world actually contains the data for 5120x5120x40 pieces of terrain, each of which is used to generate 1 room. That's 1048576000 rooms.

Thus to recap, we now have a world representing over a billion rooms which uses 281K, not including the virtual rooms (of which we need at least one per active player).

However we also want to modify the world, and that requires custom tiles. So we have another array, 512x512 again, but this time of pointers to custom tiles. This array therefore uses 1MB of RAM (512x512x4). No, I wasn't using a 64bit processor 12 years ago. Pointers were 4 bytes.

Each custom tile contains a 100x100 array, but this time of 32bit integers (this includes the terrain type, but also contains other data such as walls and doors). Each custom tile also contains four pointers, as each tile is stored in two double linked lists, one representing height, the other representing a list of all custom tiles in memory. This list is capped at whatever number I like - let's say 1000, as each player will be accessing no more than 4 at a time. That's nearly 102K. If it we need more custom tiles after that, we overwrite the oldest one and move it to the front. Note that the custom tiles are saved each time they're modified, so we don't lose any data.

The total memory overhead is now a little under 1.4MB, not including the virtual rooms - but even with several hundred of those, the total memory usage is going to be well under 2MB.
09 Jun, 2009, Runter wrote in the 154th comment:
Votes: 0
KaVir said:
David Haley said:
Several people have mentioned terminology confusion, so maybe there should be a collective effort to stop making assumptions about things and be more clear.


Okay, let me try and explain this more thoroughly. The default world is an array of 512x512 tile IDs. A tile ID is a number in the range 0-255, and is used as an index in a lookup table like this:

#define MAX_TILE 256

const unsigned char WorldTileTable[MAX_TILE][100] =
{
{
(and so on)


Therefore the default world uses 256K of RAM, and the tile table uses 25K of RAM.

As you can see, the tiles in the above table each contain 10x10 terrain types. Therefore, by extension, the default world contains the data for 5120x5120 terrain types.

The terrain type represents the terrain at normal ground level, but it can also be used to determine the terrain at higher and lower points - for example a "west-flowing river" terrain type always has 4 levels of "west-flowing underwater current" below it, followed by 15 levels of "rock". And it always has 20 levels of "air" above it. The same is true of every terrain type - each contains the information for 40 height levels of terrain.

Thus the default world actually contains the data for 5120x5120x40 pieces of terrain, each of which is used to generate 1 room. That's 1048576000 rooms.

Thus to recap, we now have a world representing over a billion rooms which uses 281K, not including the virtual rooms (of which we need at least one per active player).

However we also want to modify the world, and that requires custom tiles. So we have another array, 512x512 again, but this time of pointers to custom tiles. This array therefore uses 1MB of RAM (512x512x4). No, I wasn't using a 64bit processor 12 years ago. Pointers were 4 bytes.

Each custom tile contains a 100x100 array, but this time of 32bit integers (this includes the terrain type, but also contains other data such as walls and doors). Each custom tile also contains four pointers, as each tile is stored in two double linked lists, one representing height, the other representing a list of all custom tiles in memory. This list is capped at whatever number I like - let's say 1000, as each player will be accessing no more than 4 at a time. That's nearly 102K. If it we need more custom tiles after that, we overwrite the oldest one and move it to the front. Note that the custom tiles are saved each time they're modified, so we don't lose any data.

The total memory overhead is now a little under 1.4MB, not including the virtual rooms - but even with several hundred of those, the total memory usage is going to be well under 2MB.



Sooooo, you call that a hash grid or what? I some of the terminology was David's issue. Not any implementation specifics.
09 Jun, 2009, David Haley wrote in the 155th comment:
Votes: 0
Thank you for the detail, KaVir – it helps clear things up a lot.

KaVir said:
Thus to recap, we now have a world representing over a billion rooms which uses 281K, not including the virtual rooms

What is the difference between the first kind of room that represent physical points, and the virtual rooms of which you have at least one per player? Are virtual rooms the ones that store things? In which case, what is the difference between a "canonical room", and the room that has things in it?

KaVir said:
(…custom tiles…)

When you made the default tiles, how did you pick what the defaults would be? Do you draw out the whole world, and then infer the most common tiles? (That's actually an interesting idea…) Or did you just guess as to what the common ones would be, and from there make customizations?



EDIT: to address Runter's post: Scandum's terminology confused me more, what confused me with KaVir's posts were snippets of implementation knowledge but without really laying out the whole picture, so I wasn't sure what the constraints or design goals were. In other words KaVir's usage of low-level data structures basically always makes sense to me, it's the concepts built on top of those data structures (like template tiles, etc.) that I wasn't sure of. (E.g., the above distinction between "room" and "virtual room", "location" vs "tile" vs "room", etc.)
09 Jun, 2009, Runter wrote in the 156th comment:
Votes: 0
Do you use the templates for these for anything other than sector/terrain types? Out of curiosity.
09 Jun, 2009, KaVir wrote in the 157th comment:
Votes: 0
David Haley said:
What is the difference between the first kind of room that represent physical points, and the virtual rooms of which you have at least one per player? Are virtual rooms the ones that store things?

Yes, the virtual rooms are standard Diku/Merc room structures, as this was implemented for a Merc 2.1 derived mud, and rooms were an integral part of combat, movement and communication. The virtual rooms had no real identity of their own though - everything about them was set when you moved into them, based on the tile data.

Example:

I move north. This increments my Y position by 1, so that I'm now at position 104/128. I iterate through the vnums allocated for virtual rooms checking to see if there's one with its position set to 104/128 (a truly horrible solution in retrospect, but please remember this was 12 years ago). There's no virtual room for that position yet, so I need to create a new one - fortunately my loop noted an empty room near the start of the list, so I'm going to use that one.

I now look at the custom world table element [10][12] and it's NULL, so I look at the default world element [10][12] and check the terrain element [4][8]. It's forest. I set the terrain type of my virtual room to forest, set its x/y position to 104/128, and move my character into it.

Another player at 103/128 decides to move east. He iterates through the list of vnums allocated for virtual rooms and finds there's already one at 104/128 - my room. He moves into it. We can now talk and fight using the standard Diku mechanics.

David Haley said:
When you made the default tiles, how did you pick what the defaults would be? Do you draw out the whole world, and then infer the most common tiles? (That's actually an interesting idea…) Or did you just guess as to what the common ones would be, and from there make customizations?


The latter. The tiles were created as generic building blocks. After that, the map was created using the tiles. I frequently wished there were more tiles to cover certain situations, and when I reused the tile data in GW2 I did indeed extend it, so that more tiles could be added as needed.

Runter said:
Do you use the templates for these for anything other than sector/terrain types? Out of curiosity.

The predefined tiles contain only terrain type, although this isn't the same as Diku sector types, as it includes things like "west-flowing river", "knee-deep ocean during high tide", etc. The custom tiles consist of 32bit integers however, and use the extra bits to store walls, doors, and other things that I can't remember offhand. I recall that 1 of the bits indicated that the bits allocated for terrain type (and some of the other bits) should instead be used as the vnum of a regular room that should be used instead of a virtual room.
10 Jun, 2009, Scandum wrote in the 158th comment:
Votes: 0
KaVir said:
Each custom tile contains a 100x100 array, but this time of 32bit integers (this includes the terrain type, but also contains other data such as walls and doors).

You mean 10x10 rather than 100x100 right?


Runter said:
Sooooo, you call that a hash grid or what? I some of the terminology was David's issue. Not any implementation specifics.

I'd call it a hashed tile index grid (512x512x1) combined with a tile pointer grid (512x512x4). It could have been implemented as a single hashed tile pointer grid with multiple pointers to the same custom tile.

If you take that one step further you can create a virtual tile whenever a player enters an unused tile, allowing each virtual tile to also contain a double linked list of virtual rooms, and once all virtual rooms within a virtual tile have been destroyed you can hash the virtual tile and destroy it. It's much like OLC on Merc, to modify a hashed room description you first unhash the string, change your local copy, then hash it again.
11 Jun, 2009, Runter wrote in the 159th comment:
Votes: 0
Scandum said:
Runter said:
Sooooo, you call that a hash grid or what? I some of the terminology was David's issue. Not any implementation specifics.

I'd call it a hashed tile index grid (512x512x1) combined with a tile pointer grid (512x512x4). It could have been implemented as a single hashed tile pointer grid with multiple pointers to the same custom tile.

If you take that one step further you can create a virtual tile whenever a player enters an unused tile, allowing each virtual tile to also contain a double linked list of virtual rooms, and once all virtual rooms within a virtual tile have been destroyed you can hash the virtual tile and destroy it. It's much like OLC on Merc, to modify a hashed room description you first unhash the string, change your local copy, then hash it again.


I'm curious of why you call it hashed since it baresbears no resemblance of what a hashed data-type would function like.

Quote
A hash function is any well-defined procedure or mathematical function which converts a large, possibly variable-sized amount of data into a small datum, usually a single integer that may serve as an index into an array.

I think most (all) people associate it with this type of data storage: Something that requires deriving a key to cut down on search time within individual buckets. It could be argued that the bucket is the tile on the array and the keys are the x/y, but I'm just not seeing it. With that definition any 2d array to find any type of data with a list member would be a hash. I think this is what DH's (albeit friendly) beef was. I'm not quite so friendly of a guy, but that's okay. Neither are you. So I'll just say it. I think this is misinformation of terminology at its best that I'm glad to now dispel.
11 Jun, 2009, David Haley wrote in the 160th comment:
Votes: 0
Runter said:
With that definition any 2d array to find any type of data with a list member would be a hash.

Welllll technically it could be, although it would be a fairly terrible hash table, losing basically all benefits of hashing. :tongue:
140.0/181