6.6. Testing the Crash Dump Mechanism



Testing the crash dump mechanism deliberately crashes the kernel and causes a system reboot. This can cause data loss if the system is under heavy load, so make sure that the system is idle or under very light load before you run the test.
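As a rough pre-flight check before testing, the load condition can be expressed as a small shell function. This is only a sketch: the function name is_lightly_loaded and the default threshold of 0.50 are illustrative assumptions, not values taken from this guide.

```shell
# Sketch: succeed (exit status 0) when a load average is at or below a threshold.
# The function name and the 0.50 default threshold are illustrative assumptions.
is_lightly_loaded() {
    load="$1"
    threshold="${2:-0.50}"
    # awk performs the floating-point comparison that plain sh cannot do.
    awk -v l="$load" -v t="$threshold" 'BEGIN { exit !(l + 0 <= t + 0) }'
}

# Possible use on a live system (the first field of /proc/loadavg is the
# 1-minute load average):
#   read -r load1 _ < /proc/loadavg
#   is_lightly_loaded "$load1" && echo "OK to test" || echo "System is busy"
```

A suitable threshold depends on the machine, so treat the default as a starting point.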


Verify that the SysRq mechanism is enabled by checking the value of the /proc/sys/kernel/sysrq kernel parameter:


cat /proc/sys/kernel/sysrq


If a value of 0 is returned, the dump-then-reboot feature is disabled. A value of 1 enables all SysRq features, and a value greater than 1 indicates that only a subset of SysRq features is enabled. See /etc/sysctl.d/10-magic-sysrq.conf for a detailed description of the options and the default value. Enable dump-then-reboot testing with the following command:


sudo sysctl -w kernel.sysrq=1
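The interpretation of the sysrq value described above can be sketched as a small helper. The function name describe_sysrq is hypothetical, and the output wording is illustrative only.

```shell
# Sketch: turn a kernel.sysrq value into a short description.
# describe_sysrq is a hypothetical helper name, not part of any package.
describe_sysrq() {
    case "$1" in
        ''|*[!0-9]*)
            echo "unrecognised value"
            return 1
            ;;
        0) echo "disabled" ;;
        1) echo "all functions enabled" ;;
        *) echo "subset enabled (bitmask $1)" ;;
    esac
}

# Possible use on a live system:
#   describe_sysrq "$(cat /proc/sys/kernel/sysrq)"
```

If the helper reports anything other than "all functions enabled", run the sysctl command shown above before continuing.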


Once this is done, you must become root. Prefixing the command with sudo is not sufficient, because the output redirection is performed by your own unprivileged shell rather than by the command that sudo runs. As the root user, issue the command echo c > /proc/sysrq-trigger. If you are connected over the network, you will lose contact with the system, so it is better to run the test from the system console.

Working on the console also has the advantage of making the kernel dump process visible. A typical test output should look like the following:


sudo -s



[sudo] password for ubuntu:

# echo c > /proc/sysrq-trigger


[   31.659002] SysRq : Trigger a crash
[   31.659749] BUG: unable to handle kernel NULL pointer dereference at           (null)
[   31.662668] IP: [<ffffffff8139f166>] sysrq_handle_crash+0x16/0x20
[   31.662668] PGD 3bfb9067 PUD 368a7067 PMD 0
[   31.662668] Oops: 0002 [#1] SMP
[   31.662668] CPU 1


....


The rest of the output is truncated here. You should see the system rebooting, and somewhere in the log you will see the following line:


Begin: Saving vmcore from kernel crash ...


Once the dump has been saved, the system reboots into its normal operational mode. You will then find the kernel crash dump file, and related subdirectories, in the /var/crash directory:


ls /var/crash

201809240744 kexec_cmd linux-image-4.15.0-34-generic-201809240744.crash
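To check for dump files from a script rather than by eye, the listing above can be wrapped in a small function. This is a minimal sketch: the name list_crash_files is hypothetical, and it assumes only what the listing shows, namely that dump files end in .crash.

```shell
# Sketch: print every *.crash file found in a directory (default: /var/crash).
# list_crash_files is a hypothetical helper name.
list_crash_files() {
    dir="${1:-/var/crash}"
    for f in "$dir"/*.crash; do
        # If nothing matches, the glob stays literal; skip that case.
        [ -e "$f" ] || continue
        echo "$f"
    done
}

# Possible use after the test reboot:
#   list_crash_files /var/crash
```

Empty output means no .crash file was found in that directory.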


If the dump fails because of an OOM (out of memory) error, try increasing the amount of reserved memory by editing /etc/default/grub.d/kdump-tools.cfg. For example, to reserve 512 megabytes on systems with at least 384 megabytes of RAM:


GRUB_CMDLINE_LINUX_DEFAULT="$GRUB_CMDLINE_LINUX_DEFAULT crashkernel=384M-:512M"


Run sudo update-grub, reboot, and then test again.
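After the reboot, it can be useful to confirm that the new crashkernel setting actually reached the kernel command line. The sketch below extracts the value from a command-line string; the function name crashkernel_param is hypothetical, and it prints only the first crashkernel= entry it finds.

```shell
# Sketch: print the value of the first crashkernel= option in a kernel
# command-line string. crashkernel_param is a hypothetical helper name.
crashkernel_param() {
    # $1 is left unquoted on purpose so the string is split into words.
    for word in $1; do
        case "$word" in
            crashkernel=*)
                echo "${word#crashkernel=}"
                return 0
                ;;
        esac
    done
    return 1
}

# Possible use on the rebooted system:
#   crashkernel_param "$(cat /proc/cmdline)" || echo "crashkernel is not set"
```

If the printed value does not match what you wrote in kdump-tools.cfg, check that update-grub ran without errors.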

