19 Dec, 2008, Banner wrote in the 21st comment:
Votes: 0
Yeah, it is a core, and I can't recreate it because it only happens once in a blue moon. I can go kill every bounty mob over and over and it won't happen, and then a few days/week later, it'll happen again.
19 Dec, 2008, Davion wrote in the 22nd comment:
Votes: 0
Well, here's something to help track it down, and at the very least grab some more info on it… In do_hitall right below
   if( is_same_group( ch, vch ) || !is_legal_kill( ch, vch ) || !can_see( ch, vch ) || is_safe( ch, vch )
       || !IS_NPC( vch ) )
      continue;


add this
   if( !vch->in_room )
   {
      char buf[MSL];

      /* Use the loop variable vch here, not victim */
      snprintf( buf, sizeof( buf ), "do_hitall - (%d)%s in (%d)%s, is in a room… but not!", vch->pIndexData->vnum, vch->name, ch->in_room->vnum, ch->in_room->name );
      log_string_plus( buf, LOG_NORMAL, sysdata.log_level );
      continue; /* skip this mob instead of attacking it */
   }


Assuming it only happens from within do_hitall… That's the best I can offer ATM. That'll at least give you a record, and skip the mob. You may want to extract the mob unconditionally under these circumstances though (not sure how to do that in Smaug). And of course, if that's not how you get a vnum (via the pIndexData structure), please replace it. Smaug is alien to me. Just kinda sniping from what I see in your posts.
19 Dec, 2008, Banner wrote in the 23rd comment:
Votes: 0
It doesn't only happen in hitall. I've seen it happen from kill as well, so would it be safer to move that up a level to raw kill, maybe?
19 Dec, 2008, Davion wrote in the 24th comment:
Votes: 0
*nodnod* Can't hurt. The only time a mob should have raw_kill called on it is when it's found in a room (of course, make sure that code is called after the victim is confirmed to be an NPC). However, if the mob thinks it isn't in a room, then we have our problem.
19 Dec, 2008, Banner wrote in the 25th comment:
Votes: 0
And how can the mob be removed from the room without that showing up in the stack? How could you reproduce that, and when it happens again, how can this information help fix it?
19 Dec, 2008, Banner wrote in the 26th comment:
Votes: 0
I added this right above the AFK check:

   if( IS_NPC( victim ) && !victim->in_room )
   {
      char buf[MSL];

      sprintf( buf, "do_hitall - (%d)%s in (%d)%s, is in a room… but not!", victim->pIndexData->vnum, victim->name, ch->in_room->vnum, ch->in_room->name );
      log_string_plus( buf, LOG_NORMAL, sysdata.log_level );
      extract_char( victim, TRUE );
      ch_printf( ch, "&RA minor error involving the mob you just killed has occurred. It has been extracted. Please report this error to the administration.\n\r" );
      return;
   }
19 Dec, 2008, Davion wrote in the 27th comment:
Votes: 0
Banner said:
And how can the mob be removed from the room without that showing up in the stack? How could you reproduce that, and when it happens again, how can this information help fix it?


Well, see, that's the problem. It's very weird. Usually whenever a mob's in_room is set to NULL, the mob is also removed from the list. I'm sure it's something obscure and not something obvious like (mob->in_room = NULL). Maybe somewhere it's setting a mob's in_room with a function call and not checking the return value? It's really, really hard to tell. The best way is to attempt to find a way to reproduce the crash. -Something- is going wrong, somewhere, and it does need to be tracked down. Watch your logs and see if you notice a pattern occurring. Remember, it isn't random, it just seems like it because you don't have all the information. You may even want to remove the 'continue' call and just let it log, and have it crash. That way you can get the gdb core again and examine it as we've done here to try to figure out a pattern.
20 Dec, 2008, Banner wrote in the 28th comment:
Votes: 0
[shoie13@harbinger log]$ grep "but not" *
1299.log:Fri Dec 19 16:02:52 2008 :: do_hitall - (408)human police officer cop in (310)A City Street, is in a room… but not!
1300.log:Fri Dec 19 21:50:39 2008 :: do_hitall - (325)Toodan in (301)&YCapital Ave.&B-&CBefore Menari Spaceport, is in a room… but not!
1301.log:Fri Dec 19 21:59:44 2008 :: do_hitall - (52)patrol soldier guard in (310)A City Street, is in a room… but not!


This seems interesting. I've also got another gdb crash core from the Toodan entry. Looks exactly the same as the first.
Program terminated with signal 6, Aborted.
#0 0x0000003b492305c5 in raise () from /lib64/libc.so.6
(gdb) bt
#0 0x0000003b492305c5 in raise () from /lib64/libc.so.6
#1 0x0000003b49232070 in abort () from /lib64/libc.so.6
#2 0x00000000004bcbd0 in SegVio () at comm.c:431
#3 <signal handler called>
#4 0x000000000054f5e8 in mprog_driver (com_list=0xb192d0 "mpmload 329\n\rmpforce mobslave mpoload 10212\n\rmpforce mobslave drop all\n\rmptransfer mobslave 7\n\r",
mob=0x1b102a0, actor=0x1a81870, obj=0x0, vo=0x0, single_step=0 '\0') at mud_prog.c:1516
#5 0x00000000005512af in mprog_percent_check (mob=0x1b102a0, actor=0x1a81870, obj=0x0, vo=0x0, type=16) at mud_prog.c:2250
#6 0x00000000005518b7 in mprog_death_trigger (killer=0x1a81870, mob=0x1b102a0) at mud_prog.c:2423
#7 0x00000000004ee0b4 in raw_kill (ch=0x1a81870, victim=0x1b102a0) at fight.c:2306
#8 0x00000000004ecf6e in damage (ch=0x1a81870, victim=0x1b102a0, dam=3845, dt=1003) at fight.c:1847
#9 0x00000000004eabfb in one_hit (ch=0x1a81870, victim=0x1b102a0, dt=1003) at fight.c:1196
#10 0x00000000004b58b8 in do_hitall (ch=0x1a81870, argument=0x7fff71378596 "") at combat.c:719
#11 0x000000000057d23a in check_skill (ch=0x1a81870, command=0x7fff71378060 "hitall", argument=0x7fff71378596 "") at skills.c:395
#12 0x0000000000519942 in interpret (ch=0x1a81870, argument=0x7fff71378596 "") at interp.c:376
#13 0x00000000004bdd1d in game_loop () at comm.c:788
#14 0x00000000004bc92e in main (argc=2, argv=0x7fff71378ac8) at comm.c:302
(gdb)


It was actually the same mob and the same player (which is not always the case).
20 Dec, 2008, Scandum wrote in the 29th comment:
Votes: 0
   percent = IS_NPC( ch ) ? 80 : ch->pcdata->learned[gsn_hitall];
   for( vch = ch->in_room->first_person; vch; vch = vch_next )
   {
      vch_next = vch->next_in_room;
      if( is_same_group( ch, vch ) || !is_legal_kill( ch, vch ) || !can_see( ch, vch ) || is_safe( ch, vch )
          || !IS_NPC( vch ) )
         continue;


The problem might be with vch_next. If vch_next dies (on some muds charmed mobs die when the master dies) you'll still end up hitting it, or what's left of it.
20 Dec, 2008, Banner wrote in the 30th comment:
Votes: 0
Scandum said:
   percent = IS_NPC( ch ) ? 80 : ch->pcdata->learned[gsn_hitall];
   for( vch = ch->in_room->first_person; vch; vch = vch_next )
   {
      vch_next = vch->next_in_room;
      if( is_same_group( ch, vch ) || !is_legal_kill( ch, vch ) || !can_see( ch, vch ) || is_safe( ch, vch )
          || !IS_NPC( vch ) )
         continue;


The problem might be with vch_next. If vch_next dies (on some muds charmed mobs die when the master dies) you'll still end up hitting it, or what's left of it.

How might one go about fixing that? And on that note, I'm starting to receive a lot of bugs from hitall now.

[shoie13@harbinger log]$ grep do_hitall *
1299.log:Fri Dec 19 16:02:52 2008 :: do_hitall - (408)human police officer cop in (310)A City Street, is in a room… but not!
1300.log:Fri Dec 19 21:50:39 2008 :: do_hitall - (325)Toodan in (301)&YCapital Ave.&B-&CBefore Menari Spaceport, is in a room… but not!
1301.log:Fri Dec 19 21:59:44 2008 :: do_hitall - (52)patrol soldier guard in (310)A City Street, is in a room… but not!
1301.log:Fri Dec 19 23:22:17 2008 :: do_hitall - (52)patrol soldier guard in (301)&YCapital Ave.&B-&CBefore Menari Spaceport, is in a room… but not!
1301.log:Fri Dec 19 23:22:31 2008 :: do_hitall - (52)patrol soldier guard in (305)&YCapital Ave., is in a room… but not!
1301.log:Fri Dec 19 23:22:34 2008 :: do_hitall - (408)human police officer cop in (305)&YCapital Ave., is in a room… but not!
1301.log:Fri Dec 19 23:29:08 2008 :: do_hitall - (407)rodian in (394)A City Street, is in a room… but not!
1301.log:Fri Dec 19 23:29:08 2008 :: do_hitall - (407)rodian in (394)A City Street, is in a room… but not!
1301.log:Fri Dec 19 23:29:08 2008 :: do_hitall - (407)rodian in (394)A City Street, is in a room… but not!
1301.log:Fri Dec 19 23:29:18 2008 :: do_hitall - (407)rodian in (382)A City Street, is in a room… but not!
1301.log:Fri Dec 19 23:29:18 2008 :: do_hitall - (407)rodian in (382)A City Street, is in a room… but not!
1301.log:Fri Dec 19 23:29:18 2008 :: do_hitall - (407)rodian in (382)A City Street, is in a room… but not!
1301.log:Fri Dec 19 23:29:18 2008 :: do_hitall - (407)rodian in (382)A City Street, is in a room… but not!
1301.log:Fri Dec 19 23:30:15 2008 :: do_hitall - (407)rodian in (394)A City Street, is in a room… but not!
1301.log:Fri Dec 19 23:30:19 2008 :: do_hitall - (408)human police officer cop in (394)A City Street, is in a room… but not!
1301.log:Fri Dec 19 23:31:22 2008 :: do_hitall - (407)rodian in (386)A City Street, is in a room… but not!
1301.log:Fri Dec 19 23:31:31 2008 :: do_hitall - (408)human police officer cop in (395)A City Street, is in a room… but not!
1301.log:Fri Dec 19 23:31:50 2008 :: do_hitall - (407)rodian in (386)A City Street, is in a room… but not!
1301.log:Fri Dec 19 23:32:07 2008 :: do_hitall - (407)rodian in (386)A City Street, is in a room… but not!
1301.log:Fri Dec 19 23:32:12 2008 :: do_hitall - (407)rodian in (383)A City Street, is in a room… but not!
1301.log:Fri Dec 19 23:32:17 2008 :: do_hitall - (407)rodian in (389)A City Street, is in a room… but not!
1301.log:Fri Dec 19 23:32:17 2008 :: do_hitall - (407)rodian in (389)A City Street, is in a room… but not!
1301.log:Fri Dec 19 23:32:21 2008 :: do_hitall - (404)male prostitute young man ymp in (387)A City Street, is in a room… but not!
1301.log:Fri Dec 19 23:32:21 2008 :: do_hitall - (407)rodian in (387)A City Street, is in a room… but not!
1301.log:Fri Dec 19 23:32:21 2008 :: do_hitall - (407)rodian in (387)A City Street, is in a room… but not!
1301.log:Fri Dec 19 23:32:27 2008 :: do_hitall - (407)rodian in (378)A City Street, is in a room… but not!
[shoie13@harbinger log]$
20 Dec, 2008, Igabod wrote in the 31st comment:
Votes: 0
It'll give you that bug report if a player typed hitall and the mob left the room before hitall actually got to it. So that rodian was probably the 5th mob in the room and was in the process of leaving the room when the player decided to go all postal and shoot up the place. I'd call that a lucky rodian.
20 Dec, 2008, Guest wrote in the 32nd comment:
Votes: 0
This has all the hallmarks of a typical Smaug type problem. A big loop like do_hitall runs through the 10 occupants. One by one, they're damaged. One dies. Death routines are called. During the execution of that death routine, the death_prog stuff triggers. Acting on the same list of data as do_hitall. During *THAT* loop, the next mob that would be in line is killed and removed from the room, and then from the occupants list. Processing returns to do_hitall, only to find that the once valid vch_next is now NULL because the nested loop removed the next mob in line. Result: BLAM.

This type of thing is not easy to fix. I've got stuff in AFKMud that to this day will trip itself up at random for having nested loops just like this. I never really found a solution that works. There was some stuff added to SmaugFUSS 1.8 that is meant to help with situations like this, but the code is pretty convoluted and I don't really understand how it works. But that might be the best thing to look into. Figuring out how to adapt it to the SWR code and go from there. Debugging your way around this as-is will prove quite difficult.
20 Dec, 2008, David Haley wrote in the 33rd comment:
Votes: 0
A relatively easy solution is to store not pointers but some other identifier, and have a map that goes from said identifier to the real data structure. Upon death, you remove the identifier from the map. Then your iteration can simply notice that and move to the next element in the list. Of course this assumes that the list is separate from the things in the list, which is not the case in SMAUGish data structures (where the list nodes are a part of the things in the list).
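A rough sketch of that idea in C, to show the shape of it rather than anything you could drop into SMAUG. Everything here is made up for illustration: the MOB struct, the fixed-size id_map table, kills_on_death (a toy stand-in for a death prog that kills another mob), and the room being a plain array of IDs kept separate from the mobs themselves, which is exactly the assumption mentioned above.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_IDS 128   /* hypothetical table size; id 0 means "nobody" */

typedef struct mob
{
   int id;
   int kills_on_death;   /* toy death prog: id of another mob to kill, 0 = none */
   int times_hit;
} MOB;

/* id -> live mob. An entry goes back to NULL the moment the mob dies. */
static MOB *id_map[MAX_IDS];

static void register_mob( MOB *m )
{
   assert( m->id > 0 && m->id < MAX_IDS );
   assert( id_map[m->id] == NULL );   /* ids must be unique */
   id_map[m->id] = m;
}

static MOB *lookup( int id )
{
   return ( id > 0 && id < MAX_IDS ) ? id_map[id] : NULL;
}

static void kill_mob( int id )
{
   MOB *m = lookup( id );

   if( !m )
      return;                       /* already dead, nothing to do */
   id_map[id] = NULL;               /* from here on, lookups for this id fail */
   kill_mob( m->kills_on_death );   /* the death prog may take someone else out */
}

/* The room holds ids, not pointers, so a stale entry is detected by lookup. */
static int hit_all( const int *occupants, int count )
{
   int hits = 0;

   for( int i = 0; i < count; i++ )
   {
      MOB *m = lookup( occupants[i] );

      if( !m )
         continue;                  /* died earlier in this same loop: skip */
      m->times_hit++;
      hits++;
      kill_mob( m->id );
   }
   return hits;
}
```

With three mobs in a room where the first one's death kills the second, hit_all hits the first and third and silently skips the second. The occupants array still holds the dead IDs afterwards, so a real version would need to sweep them out once the loop is done.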
20 Dec, 2008, Tyche wrote in the 34th comment:
Votes: 0
I have seen this bug before in Mobprogs. What happens is the player/mob kills the player/mob immediately following it in the room's linked list of players.

   for( vch = ch->in_room->first_person; vch; vch = vch_next )
   {
      vch_next = vch->next_in_room;
      ….[here]…


The above guards against vch being killed/extracted during [here], however what if vch->next_in_room is killed/extracted?
Then vch_next points to a mob/player that has already been killed/extracted.
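
To make that concrete, here is a small toy model in C, not Smaug code. All names (MOB, ROOM, mob_to_room, mob_from_room, kills_on_death, hit_all, MAX_OCCUPANTS) are invented for the example, and it assumes a removed mob's memory is not freed until some later cleanup pass, so its in_room field can still be read. It shows one possible guard: take a snapshot of the occupants up front, then re-check in_room before touching each one.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_OCCUPANTS 64   /* hypothetical cap on the snapshot size */

typedef struct room ROOM;
typedef struct mob MOB;

struct mob
{
   ROOM *in_room;          /* NULL once the mob has left or died */
   MOB *next_in_room;
   MOB *kills_on_death;    /* toy stand-in for a death prog killing another mob */
   int times_hit;
};

struct room
{
   MOB *first_person;
};

static void mob_to_room( MOB *m, ROOM *r )
{
   MOB **pp = &r->first_person;

   assert( m->in_room == NULL );   /* must not already be somewhere */
   while( *pp )
      pp = &( *pp )->next_in_room;
   *pp = m;
   m->next_in_room = NULL;
   m->in_room = r;
}

/* Unlink only. Assumption: the memory is freed later, not here. */
static void mob_from_room( MOB *m )
{
   MOB **pp;

   if( !m->in_room )
      return;
   pp = &m->in_room->first_person;
   while( *pp && *pp != m )
      pp = &( *pp )->next_in_room;
   if( *pp )
      *pp = m->next_in_room;
   m->next_in_room = NULL;
   m->in_room = NULL;
}

static void kill_mob( MOB *m )
{
   MOB *other = m->kills_on_death;

   mob_from_room( m );
   if( other )
      mob_from_room( other );   /* this is what invalidates a saved vch_next */
}

/* Snapshot the occupants first, then re-validate each before hitting it. */
static int hit_all( ROOM *r )
{
   MOB *snap[MAX_OCCUPANTS];
   int n = 0, hits = 0;

   for( MOB *m = r->first_person; m && n < MAX_OCCUPANTS; m = m->next_in_room )
      snap[n++] = m;

   for( int i = 0; i < n; i++ )
   {
      if( snap[i]->in_room != r )
         continue;               /* removed since the snapshot was taken */
      snap[i]->times_hit++;
      hits++;
      kill_mob( snap[i] );
   }
   return hits;
}
```

With mobs a, b, c in one room and a's death also removing b, the plain vch_next loop would go on to hit b after it has left the list. The snapshot loop hits a and c only. If removed mobs can be freed immediately in your codebase, the in_room check itself reads freed memory and this guard is not enough.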
20 Dec, 2008, Scandum wrote in the 35th comment:
Votes: 0
One solution is using a global vch_next which you update to ->next or next_in_room if it's found in the extract_char routine, but that's only really useful in char_update, violence_update, and other main loops.
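
A minimal sketch of that global-cursor idea in C, using a toy room list rather than real Smaug structures. The names (g_next_in_room, MOB, ROOM, mob_from_room, kills_on_death, hit_all) are hypothetical, and mob_from_room stands in for whatever the removal path in extract_char does in your codebase.

```c
#include <assert.h>
#include <stddef.h>

typedef struct room ROOM;
typedef struct mob MOB;

struct mob
{
   ROOM *in_room;
   MOB *next_in_room;
   MOB *kills_on_death;   /* toy stand-in for a death prog killing another mob */
   int times_hit;
};

struct room
{
   MOB *first_person;
};

/* The mob the currently running room loop intends to visit next. */
static MOB *g_next_in_room = NULL;

static void mob_to_room( MOB *m, ROOM *r )
{
   MOB **pp = &r->first_person;

   assert( m->in_room == NULL );
   while( *pp )
      pp = &( *pp )->next_in_room;
   *pp = m;
   m->next_in_room = NULL;
   m->in_room = r;
}

static void mob_from_room( MOB *m )
{
   MOB **pp;

   if( !m->in_room )
      return;
   /* If a loop was about to step onto this mob, move it along first. */
   if( g_next_in_room == m )
      g_next_in_room = m->next_in_room;
   pp = &m->in_room->first_person;
   while( *pp && *pp != m )
      pp = &( *pp )->next_in_room;
   if( *pp )
      *pp = m->next_in_room;
   m->next_in_room = NULL;
   m->in_room = NULL;
}

static void kill_mob( MOB *m )
{
   MOB *other = m->kills_on_death;

   mob_from_room( m );
   if( other )
      mob_from_room( other );
}

static int hit_all( ROOM *r )
{
   int hits = 0;

   assert( g_next_in_room == NULL );   /* one cursor only: loops must not nest */
   for( MOB *m = r->first_person; m; m = g_next_in_room )
   {
      g_next_in_room = m->next_in_room;
      m->times_hit++;
      hits++;
      kill_mob( m );                   /* may advance g_next_in_room for us */
   }
   g_next_in_room = NULL;
   return hits;
}
```

The assert at the top of hit_all is the limitation in a nutshell: there is a single cursor, so a second loop started from inside the first would overwrite it. That fits the point above about this being mainly useful for top-level loops.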
20 Dec, 2008, Lobotomy wrote in the 36th comment:
Votes: 0
I would suggest looking into copying/ripping the LIST code from SocketMUD, as it's a rather nifty method for handling lists that averts said problems with members of lists suddenly being removed. It's essentially a struct that contains a double-linked list of cells, where each cell is a struct that has a void pointer to some specific data. The key is that each cell tracks when it gets invalidated so if you have situations where loops are nested the entire structure doesn't fail; invalidated cells are simply skipped.

It's likely a little slower and uses more RAM than the regular double-linked lists used in things like Smaug, but I'd say the safety it provides is more than worth it. The only thing to bear in mind with the list code from stock SocketMUD is that it doesn't contain a method for reverse list iteration, inserting before or after an element, or fully checking the integrity of a list's links, so they'd be things you would need to create yourself if you actually want/need them.
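
For anyone who wants the gist without reading the SocketMUD source, here is a stripped-down sketch of the invalidated-cell idea in C. It is written from the description above, not copied from SocketMUD, and every name in it (LIST, CELL, ITER, list_attach, list_detach, iter_attach, iter_next, iter_detach) is hypothetical. Removal while any iterator is open only marks the cell invalid; the cell is actually freed once the last iterator is detached.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct cell
{
   struct cell *next, *prev;
   void *data;
   int valid;             /* 0 = logically removed, waiting to be freed */
} CELL;

typedef struct list
{
   CELL *first, *last;
   int iterators;         /* number of iterators currently attached */
} LIST;

typedef struct iter
{
   LIST *list;
   CELL *cell;            /* next cell to hand out */
} ITER;

static void list_attach( LIST *l, void *data )
{
   CELL *c = malloc( sizeof *c );

   if( !c )
      abort(  );
   c->data = data;
   c->valid = 1;
   c->next = NULL;
   c->prev = l->last;
   if( l->last )
      l->last->next = c;
   else
      l->first = c;
   l->last = c;
}

static void cell_unlink( LIST *l, CELL *c )
{
   if( c->prev )
      c->prev->next = c->next;
   else
      l->first = c->next;
   if( c->next )
      c->next->prev = c->prev;
   else
      l->last = c->prev;
   free( c );
}

static void list_detach( LIST *l, void *data )
{
   for( CELL *c = l->first; c; c = c->next )
   {
      if( c->valid && c->data == data )
      {
         if( l->iterators > 0 )
            c->valid = 0;          /* someone is looping: defer the free */
         else
            cell_unlink( l, c );
         return;
      }
   }
}

static void iter_attach( ITER *it, LIST *l )
{
   it->list = l;
   it->cell = l->first;
   l->iterators++;
}

/* Returns the next live element, or NULL at the end of the list. */
static void *iter_next( ITER *it )
{
   void *data;

   while( it->cell && !it->cell->valid )
      it->cell = it->cell->next;   /* skip cells removed mid-iteration */
   if( !it->cell )
      return NULL;
   data = it->cell->data;
   it->cell = it->cell->next;
   return data;
}

static void iter_detach( ITER *it )
{
   LIST *l = it->list;

   assert( l->iterators > 0 );
   if( --l->iterators == 0 )
   {
      for( CELL *c = l->first, *n; c; c = n )
      {
         n = c->next;
         if( !c->valid )
            cell_unlink( l, c );   /* safe now: nobody is iterating */
      }
   }
   it->list = NULL;
   it->cell = NULL;
}
```

Because the iterator count lives on the list, nested loops over the same list each attach their own iterator and the deferred frees wait for the outermost one to finish. The extra pointer hop and the per-cell flag are where the small speed and memory cost comes from.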
20 Dec, 2008, David Haley wrote in the 37th comment:
Votes: 0
If compiling with g++, even if not using C++ much at all, I would suggest just using the std::list or std::vector classes. You get type safety from the templates, and a very good data structure implementation as well.
20 Dec, 2008, Skol wrote in the 38th comment:
Votes: 0
I've used Tyche's method above (vch_next = vch->next; etc) for years and it's eliminated those random 'wtf' moments.
20 Dec, 2008, David Haley wrote in the 39th comment:
Votes: 0
What Tyche posted doesn't work in cases like this, as Tyche said. :wink:

(The problem is that vch_next is invalidated as well as vch, so you can't use vch_next->next.)
20 Dec, 2008, Guest wrote in the 40th comment:
Votes: 0
DavidHaley said:
If compiling with g++, even if not using C++ much at all, I would suggest just using the std::list or std::vector classes. You get type safety from the templates, and a very good data structure implementation as well.


Would using std::list prevent the situation as listed here? With the player killing vch_next and having him dropped from the room?