Issue with continuous stop and start of mhvtl

Post by raviheg » Mon Sep 13, 2010 6:42 am

For testing purposes I need to emulate different libraries and standalone tape drives, and for this I need to start and stop mhvtl frequently.
When I do, I end up with a kernel error. I am using mhvtl-0.18-10 (dev) on Fedora 12 (x86_64, Linux 2.6.31.5). These are the steps I follow:
1. Stop mhvtl ( /etc/init.d/mhvtl stop )
2. Generate my own device.conf and library_content.<nn> files
3. Start mhvtl
4. Stop mhvtl, delete the existing conf files and create new conf files
5. Start mhvtl
6. Stop mhvtl.
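
The steps above can be sketched as a small loop that also checks the kernel log after each start. This is only a rough sketch, not part of mhvtl: MHVTL_INIT, DMESG_CMD, make_conf, kernel_log_has_oops and repro_cycles are names made up for illustration, and the script assumes the kernel log holds no earlier oops when it starts.

```shell
#!/bin/sh
# Sketch only: every name below is illustrative, not part of mhvtl.
MHVTL_INIT="${MHVTL_INIT:-/etc/init.d/mhvtl}"
DMESG_CMD="${DMESG_CMD:-dmesg}"

# Hypothetical hook: replace the body with whatever writes device.conf
# and the library contents files for the configuration under test.
make_conf() {
    :
}

# Succeeds if the kernel log currently contains an oops marker.
# Assumes no oops was already in the log before the test began.
kernel_log_has_oops() {
    $DMESG_CMD | grep -q -e 'BUG: unable to handle' -e 'Oops:'
}

# Run up to $1 stop/regenerate/start cycles, stopping early on an oops.
# Prints the number of cycles that completed without an oops.
repro_cycles() {
    max=$1
    done_cycles=0
    while [ "$done_cycles" -lt "$max" ]; do
        "$MHVTL_INIT" stop >&2
        make_conf
        "$MHVTL_INIT" start >&2
        if kernel_log_has_oops; then
            break
        fi
        done_cycles=$((done_cycles + 1))
    done
    "$MHVTL_INIT" stop >&2
    echo "$done_cycles"
}
```

Calling `repro_cycles 20` would then report how many cycles ran cleanly before the log showed an oops.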

I get the following output from dmesg:

mhvtl: timer_intr_handler: Unexpected interrupt, indx 590
BUG: unable to handle kernel NULL pointer dereference at 0000000000000088
IP: [<ffffffff8141a9ca>] _spin_lock_irqsave+0x1f/0x39
PGD 2205e067 PUD 221ff067 PMD 0 
Oops: 0002 [#1] SMP 
last sysfs file: /sys/devices/pseudo_0/adapter0/host3/target3:0:1/3:0:1:0/scsi_generic/sg11/uevent
CPU 0 
Modules linked in: fuse ch osst st mhvtl sunrpc vmblock vsock vmmemctl vmhgfs ib_iser rdma_cm ib_cm iw_cm ib_sa ib_mad ib_core ib_addr ipv6 iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi dm_multipath uinput snd_ens1371 gameport snd_rawmidi vmci snd_ac97_codec ac97_bus snd_seq snd_seq_device snd_pcm snd_timer ppdev e1000 snd parport_pc shpchp soundcore snd_page_alloc i2c_piix4 parport i2c_core mptspi mptscsih mptbase scsi_transport_spi floppy [last unloaded: speedstep_lib]
Pid: 1462, comm: scsi_eh_3 Not tainted 2.6.31.5-127.fc12.x86_64 #1 VMware Virtual Platform
RIP: 0010:[<ffffffff8141a9ca>] [<ffffffff8141a9ca>] _spin_lock_irqsave+0x1f/0x39
RSP: 0018:ffff8800324e5e10 EFLAGS: 00010046
RAX: 0000000000010000 RBX: ffff88002a0e3e00 RCX: ffff880032420410
RDX: 0000000000000246 RSI: ffff8800324e5e90 RDI: 0000000000000088
RBP: ffff8800324e5e10 R08: 0000000000000000 R09: ffff8800019c9190
R10: ffffffff816f6fc0 R11: ffff88003a7d3330 R12: 0000000000000088
R13: ffff8800324e5ea0 R14: 0000000000000000 R15: ffff8800324b6080
FS: 0000000000000000(0000) GS:ffff8800019b9000(0000) knlGS:0000000000000000
CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 0000000000000088 CR3: 00000000221d0000 CR4: 00000000000006f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process scsi_eh_3 (pid: 1462, threadinfo ffff8800324e4000, task ffff880032451780)
Stack:
ffff8800324e5e50 ffffffffa02ec0ef 00000000015532d0 ffff8800324b6000
<0> ffff88002a0e3e00 ffff8800324e5ea0 ffff8800324e5e90 ffff8800324b6080
<0> ffff8800324e5ee0 ffffffff812bbb42 ffff880032451b48 0000000000015600
Call Trace:
[<ffffffffa02ec0ef>] vtl_abort+0x63/0xee [mhvtl]
[<ffffffff812bbb42>] scsi_error_handler+0x2c5/0x566
[<ffffffff812bb87d>] ? scsi_error_handler+0x0/0x566
[<ffffffff81067765>] kthread+0x91/0x99
[<ffffffff81012daa>] child_rip+0xa/0x20
[<ffffffff810676d4>] ? kthread+0x0/0x99
[<ffffffff81012da0>] ? child_rip+0x0/0x20
Code: 83 2f 01 79 05 e8 37 7a de ff c9 c3 55 48 89 e5 0f 1f 44 00 00 9c 58 0f 1f 44 00 00 48 89 c2 fa 66 0f 1f 44 00 00 b8 00 00 01 00 <3e> 0f c1 07 0f b7 c8 c1 e8 10 39 c1 74 07 f3 90 0f b7 0f eb f5 
RIP [<ffffffff8141a9ca>] _spin_lock_irqsave+0x1f/0x39
RSP <ffff8800324e5e10>
CR2: 0000000000000088
---[ end trace c34adaa4ce2deefd ]---

Can you please let me know why this issue occurs?
raviheg
Registered
 
Posts: 7
Joined: Thu Aug 05, 2010 2:08 am

Re: Issue with continuous stop and start of mhvtl

Post by rami766 » Mon Sep 13, 2010 8:33 am

This could be a bug, but Mark H has to confirm.

Can you test the following as a workaround?

/etc/init.d/mhvtl stop
/etc/init.d/mhvtl shutdown 


shutdown: unloads the kernel module
Rami
rami766
Member
 
Posts: 42
Joined: Sat Aug 14, 2010 12:04 am

Re: Issue with continuous stop and start of mhvtl

Post by raviheg » Mon Sep 13, 2010 9:02 am

I have tried shutdown too, but rmmod does not remove mhvtl because the module is still in use for some reason. However, it looks like I am getting it to work: I added a 30-second sleep after stop, before starting again, and a 10-second sleep after start. So far I have run 20 consecutive start/stop cycles and it is still working. I will post if I hit any hurdle.
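
For anyone wanting to try the same thing, the delays described above could look roughly like this. It is a sketch only: MHVTL_INIT, STOP_SETTLE, START_SETTLE and restart_with_delays are illustrative names, and the 30 and 10 second values are simply the ones that worked in this test, not tuned or guaranteed limits.

```shell
#!/bin/sh
# Sketch only: variable and function names here are illustrative.
MHVTL_INIT="${MHVTL_INIT:-/etc/init.d/mhvtl}"
STOP_SETTLE="${STOP_SETTLE:-30}"    # seconds to wait after "stop"
START_SETTLE="${START_SETTLE:-10}"  # seconds to wait after "start"

# Stop mhvtl, wait, then start it again and wait once more.
# The new conf files are assumed to be written while mhvtl is stopped.
restart_with_delays() {
    "$MHVTL_INIT" stop || return 1
    sleep "$STOP_SETTLE"
    "$MHVTL_INIT" start || return 1
    sleep "$START_SETTLE"
}
```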

Re: Issue with continuous stop and start of mhvtl

Post by markh794 » Mon Sep 13, 2010 8:34 pm

Looking at the kdump:
Call Trace:
[<ffffffffa02ec0ef>] vtl_abort+0x63/0xee [mhvtl]
[<ffffffff812bbb42>] scsi_error_handler+0x2c5/0x566
[<ffffffff812bb87d>] ? scsi_error_handler+0x0/0x566
[<ffffffff81067765>] kthread+0x91/0x99
[<ffffffff81012daa>] child_rip+0xa/0x20
[<ffffffff810676d4>] ? kthread+0x0/0x99
[<ffffffff81012da0>] ? child_rip+0x0/0x20
Code: 83 2f 01 79 05 e8 37 7a de ff c9 c3 55 48 89 e5 0f 1f 44 00 00 9c 58 0f 1f 44 00 00 48 89 c2 fa 66 0f 1f 44 00 00 b8 00 00 01 00 <3e> 0f c1 07 0f b7 c8 c1 e8 10 39 c1 74 07 f3 90 0f b7 0f eb f5
RIP [<ffffffff8141a9ca>] _spin_lock_irqsave+0x1f/0x39
RSP <ffff8800324e5e10>
CR2: 0000000000000088
---[ end trace c34adaa4ce2deefd ]---

It looks like the driver was removed while a SCSI command was still in flight (i.e. a SCSI command had entered the scsi_handler).

Any better analysis welcome.
Any suggestions on how to fix would be even more welcome.

As I do not normally unload and reload the user-space daemons that often, I can't say I've encountered this issue.
And when I do unload the user-space daemons, it is normally 5+ minutes before I reload them. Perhaps it's a timing issue.
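
If it is a timing issue, one alternative to a fixed sleep would be to poll the module's use count before unloading or reloading. The following is a sketch under assumptions, not a confirmed fix: module_use_count and wait_module_idle are made-up names, the use count is read from the third column of /proc/modules, and a count of zero does not prove that no SCSI error handling is still running.

```shell
#!/bin/sh
# Sketch only: function names and the polling approach are illustrative.
MODULES_FILE="${MODULES_FILE:-/proc/modules}"

# Print the use count of module $1 (third column of /proc/modules).
# Prints nothing if the module is not loaded.
module_use_count() {
    awk -v name="$1" '$1 == name { print $3; exit }' "$MODULES_FILE"
}

# Poll until module $1 is unloaded or idle (use count 0), checking once
# per second for at most $2 seconds. Returns non-zero on timeout.
wait_module_idle() {
    waited=0
    while :; do
        count=$(module_use_count "$1")
        if [ -z "$count" ] || [ "$count" = "0" ]; then
            return 0
        fi
        if [ "$waited" -ge "$2" ]; then
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
}
```

A possible use would be `/etc/init.d/mhvtl stop; wait_module_idle mhvtl 60 && /etc/init.d/mhvtl shutdown`, so the module is only unloaded once nothing reports it as in use.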
markh794
MHVTL - Developer
 
Posts: 101
Joined: Sat Feb 20, 2010 6:30 pm
Location: Sydney, Australia

Re: Issue with continuous stop and start of mhvtl

Post by ap2010 » Tue Sep 14, 2010 2:10 am

Looking at 
Pid: 1462, comm: scsi_eh_3 Not tainted 2.6.31.5-127.fc12.x86_64 #1 VMware Virtual Platform

I would say that this is a Linux install on a VMware platform, running mhvtl.

It could be a problem with that.

Just my two cents.

Albert
ap2010
Member
 
Posts: 14
Joined: Thu May 13, 2010 5:18 am

Re: Issue with continuous stop and start of mhvtl

Post by markh794 » Tue Sep 14, 2010 6:10 am

While I'd like to pass the blame, I normally run the VTL within a VMware (Server 1.0.10 or 2.0.2) guest environment myself.

I have to admit it is more likely that I'm doing something wrong when registering/unregistering devices, which allows SCSI commands to sneak through a gap while the devices are being torn down.

Cheers
Mark