freeze - 12.10 - GPU lockup

19
2014-04
  • sunil

    I just downloaded Ubuntu 12.10 and I am trying to install it on my laptop, which has an Nvidia chipset. On previous versions I was able to install by using the nomodeset option and then installing the graphics driver later on. The last version I used was Ubuntu 11.

    Now I am stuck with Ubuntu 12.10 hanging on first boot with a GPU lockup error. I know it's the graphics card bug, but every time I go through GRUB into the failsafe X option, the system just sits there, stuck.

    Please help with this one.

  • Answers

    Related Question

    12.04 - Is there a simple, safe way to trigger a GPU lockup on a susceptible computer?
  • Abe

    Answers to my previous question, "Ubuntu 12.04 froze, requiring powercycle. What should I look / grep for in the logs?", have led me to suspect that my computer is experiencing an intermittent GPU lockup. It has been happening about once a week, usually when I am using Chrome. Today it happened when I was creating a diagram on Lucidchart.

    I have a Dell Optiplex 755 with an ATI Radeon HD 2400 XT and dual monitors running in Xinerama mode. I am using 12.04 with the proprietary ATI driver installed.

    When the computer locks up, I can still ssh in, and I would like to follow the reporting instructions provided at https://wiki.ubuntu.com/X/Troubleshooting/Freeze

    Is there a (safe) way to cause a GPU lockup so that I can go ahead and file a bug, rather than waiting until it happens again?


  • Related Answers
  • Bryce

    Excellent question.

    Workloads

    The /usr/share/xdiagnose/workloads directory has a set of workloads designed to exercise your graphics system to trigger lockups.

    $ ls /usr/share/xdiagnose/workloads/
    README                       do_monitor_rotation_loop
    do_chws_loop*                do_screensaver_loop*
    do_cpu_spin_loop             do_video_loop*
    do_disk_write_loop           do_vtswitch_loop*
    do_glx_loop*                 repro.sh
    do_kernel_compile_loop       run_workloads
    do_monitor_disable_loop*     youtube-loop.html
    do_monitor_resolution_loop*  youtube-reload.html
    

    Note that to run them you need to pass 'run'. E.g.:

    $ do_glx_loop run

    With no args the scripts will display usage. Partly that's for safety (in case people just blindly run the scripts), but mostly it's to keep the scripts' API tidy.

    The ones I've starred are probably the best ones to start with. I would start by running just one script at a time and letting it go for a few hours. If your system survives that well enough, then try running two or more simultaneously.
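
    As a rough sketch of the "one script for a few hours" approach, the helper below wraps a workload in timeout(1) so it stops on its own. The name run_workload_for and the WORKLOAD_DIR variable are made up for illustration and are not part of xdiagnose; only the directory path and the 'run' argument come from the text above.

```shell
#!/bin/sh
# Sketch only: run_workload_for is a hypothetical helper, not part of xdiagnose.
# WORKLOAD_DIR defaults to the directory listed above; override it to experiment.
WORKLOAD_DIR="${WORKLOAD_DIR:-/usr/share/xdiagnose/workloads}"

# run_workload_for DURATION SCRIPT
# DURATION is anything timeout(1) accepts, e.g. 30m or 3h.
run_workload_for() {
    duration="$1"
    script="$2"
    if [ ! -x "$WORKLOAD_DIR/$script" ]; then
        echo "not an executable workload: $script" >&2
        return 2
    fi
    # The scripts only do work when given the 'run' argument.
    status=0
    timeout "$duration" "$WORKLOAD_DIR/$script" run || status=$?
    # timeout(1) exits with 124 when the time limit expired, which here
    # means the workload was still running and the system did not lock up.
    if [ "$status" -eq 124 ]; then
        echo "$script: survived $duration"
        return 0
    fi
    echo "$script: stopped early with status $status"
    return "$status"
}
```

    For example, `run_workload_for 3h do_glx_loop` runs a single workload, and once that survives, `run_workload_for 3h do_glx_loop & run_workload_for 3h do_vtswitch_loop & wait` runs two at the same time.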

    Note I haven't tested these super heavily myself, so I can't promise they're bug free. But they're quite short and simple scripts, so hopefully easy to fix up, and patches are very much welcome.

    Also note that they may well trigger lockups unrelated to the one you're trying to solve. GPU lockups generally all look identical to the untrained eye, since they have more or less the exact same symptoms.

    Logs

    If you're on Intel Graphics, there is a /sys/kernel/debug/dri/0/i915_error_state that you want. This is a snapshot of the register state at the time of the hang, and the top of it contains some error codes: IPEHR, PGTBL_ER, ESR, and EIR. Match those codes up to see if you have the same or a similar error.
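
    A minimal sketch of "matching those codes up" between two saved snapshots follows. The helper names error_codes and same_error are hypothetical, and the sketch assumes each register appears on its own line in the form "NAME: value"; check that against your own file before relying on it. Reading debugfs normally requires root, so it is easiest to copy the file somewhere first.

```shell
#!/bin/sh
# Sketch only: error_codes and same_error are hypothetical helpers.
# Assumes each register appears on its own line as "NAME: value".
ERROR_STATE="${ERROR_STATE:-/sys/kernel/debug/dri/0/i915_error_state}"

# error_codes [FILE]: print just the IPEHR, PGTBL_ER, ESR and EIR lines,
# with leading whitespace removed.
error_codes() {
    grep -E '(^|[[:space:]])(IPEHR|PGTBL_ER|ESR|EIR):' "${1:-$ERROR_STATE}" |
        sed 's/^[[:space:]]*//'
}

# same_error FILE_A FILE_B: succeed if both snapshots show the same codes.
same_error() {
    [ "$(error_codes "$1")" = "$(error_codes "$2")" ]
}
```

    With two saved copies, `same_error hang1.txt hang2.txt && echo "same codes"` gives a quick first comparison; the file names here are just examples.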

    If you're not on Intel Graphics (as in this case you're not), or if you're not seeing i915_error_state files generated, then dmesg and /var/log/kern.log are what to look at. With GPU lockups, they will sometimes indicate what caused the lockup or where it occurred.
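
    One way to go through those logs is sketched below. The helper name scan_for_lockups is hypothetical, and the search pattern is only a guess at typical wording, so widen it if your driver phrases things differently.

```shell
#!/bin/sh
# Sketch only: scan_for_lockups is a hypothetical helper, and the pattern
# below is a guess at typical wording, not an exhaustive list.
LOCKUP_PATTERN='gpu (lockup|hang|hung)'

# scan_for_lockups FILE...: print matching lines prefixed with the file name.
# Unreadable or missing files are skipped.
# Returns 0 if anything matched, 1 otherwise.
scan_for_lockups() {
    found=1
    for log in "$@"; do
        [ -r "$log" ] || continue
        if grep -iEH "$LOCKUP_PATTERN" "$log"; then
            found=0
        fi
    done
    return "$found"
}
```

    For example, `scan_for_lockups /var/log/kern.log /var/log/kern.log.1` checks the current and the previous kernel log, and `dmesg | grep -iE "$LOCKUP_PATTERN"` applies the same pattern to the live kernel buffer over ssh.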

    The open source -ati driver has radeontool and avivotool, which capture register states. These are primarily for the open source -ati driver, but the tools should also work with -fglrx. I've never seen it requested for an -fglrx bug, but it certainly can't hurt.

    Testing

    For all drivers, the next step is usually to start testing either newer or older versions of the driver. For proprietary drivers, you can check the x-updates PPA, but you will probably have to download and manually install the driver from the vendor website (and mess up your system's packaging in so doing). For FOSS drivers like -intel, -nouveau, and -ati, that means testing either newer kernels or newer mesa. We provide packaged builds of newer kernels at http://kernel.ubuntu.com/~kernel-ppa/mainline/. For mesa, there are various PPAs such as xorg-edgers. I'm also in the process of preparing an 8.0.3 update for precise, which we believe fixes a number of lockups for Intel Graphics.

    In any case, don't just stop when you find a version that works. Try other versions in between your working version and the broken one. If you can narrow the bracket down to two adjacent versions, that can be hugely helpful to the developers in isolating what patch caused the regression.
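
    Narrowing the bracket is a manual bisection: test the version halfway between the known-working and known-broken ones, then repeat with the smaller bracket. The sketch below only does the bookkeeping of picking the next version to try; installing and testing each one is still up to you. The helper name next_version is hypothetical, and the version labels in the usage note are placeholders.

```shell
#!/bin/sh
# Sketch only: next_version is a hypothetical helper for manual bisection.
# next_version GOOD BAD VERSION...
# VERSION... must be listed in release order and include GOOD and BAD.
# Prints the version halfway between GOOD and BAD, or returns 1 with a
# message once the two are adjacent (or are not both in the list).
next_version() {
    good="$1"
    bad="$2"
    shift 2
    i=0
    gi=0
    bi=0
    for v in "$@"; do
        i=$((i + 1))
        if [ "$v" = "$good" ]; then gi=$i; fi
        if [ "$v" = "$bad" ]; then bi=$i; fi
    done
    if [ "$gi" -eq 0 ] || [ "$bi" -eq 0 ]; then
        echo "GOOD and BAD must both be in the list" >&2
        return 1
    fi
    # Work on the lower and upper positions so it does not matter whether
    # the working version is older or newer than the broken one.
    if [ "$gi" -lt "$bi" ]; then lo=$gi; hi=$bi; else lo=$bi; hi=$gi; fi
    if [ $((hi - lo)) -le 1 ]; then
        echo "bracket is down to adjacent versions: $good (works), $bad (broken)"
        return 1
    fi
    mid=$(( (lo + hi) / 2 ))
    i=0
    for v in "$@"; do
        i=$((i + 1))
        if [ "$i" -eq "$mid" ]; then echo "$v"; return 0; fi
    done
}
```

    For example, with five candidate builds labelled v1 to v5, `next_version v1 v5 v1 v2 v3 v4 v5` suggests trying v3. If v3 works, the next call is `next_version v3 v5 v1 v2 v3 v4 v5`, and so on until the helper reports that the two versions are adjacent.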

    Contributing

    As you go through the troubleshooting you might spot errors, or might come up with improvements for the scripts or docs. Contributions to any of these are warmly welcomed. With the wiki docs, please do just go ahead and edit! I try to update them at least once a year, but I don't always get around to it, and the next guy to visit the page will certainly appreciate your effort at improving them.

    Changes to the scripts themselves are also quite welcome. Send me changes however you feel comfortable: as patches, a bzr or git branch, or even just copies of the script. If you plan to do a lot of changes, a bzr branch with a merge proposal is the preferred way; tutorials on how to do this are available at code.launchpad.net, or feel free to catch me on IRC if you have questions.

    Or, if you're not ready to dig into coding but would like to flag errors or areas where more functionality is needed, you can file bug reports the usual way (ubuntu-bug xdiagnose).

    Quick Fixes

    If you're not interested in doing any of the above debugging, here are some random tips:

    For proprietary drivers, try uninstalling and purging them completely from your system, then reinstalling from scratch. This unfortunately "solves" a lot of bugs...

    For the FOSS drivers, there are various kernel switches you can play around with. For 3D/mesa bugs, there is also driconf to tweak various settings.

    Finally

    Finally, one request... please don't file bug reports to Launchpad about "random freezes" until you've done at least a little sleuthing such as described above. Otherwise, you'd just be adding to the noise.

    We do try to fish out well-researched bug reports; we find these give higher bang for the buck and are a lot more likely to end up with an actual fix for the distro.