
I'm frequently getting "SIGSEGV" and "Segmentation Fault" errors when running various applications on Ubuntu 22.04 (dozens of times a day). This has been happening since I got a new PC a year ago.

It happens everywhere: primarily in CPU/RAM-intensive CLI applications, but also in web browsers (Chrome and Firefox), the update manager, Nautilus, VSCode, etc.
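One way to check how widely the crashes are spread is to tally the segfault lines in the kernel log by program name. This is only a rough sketch: the function name `tally_segfaults` is made up for illustration, and it assumes the usual x86 kernel log line shape, `name[pid]: segfault at ...`.

```shell
# tally_segfaults: reads kernel log lines on stdin and prints
# "<count> <program>" for every program that segfaulted, most frequent first.
# Assumption: lines look like "name[pid]: segfault at <addr> ip ... sp ...".
# Limitation: program names containing spaces are cut to their last word.
tally_segfaults() {
    awk '
        {
            for (i = 2; i < NF; i++) {
                if ($i == "segfault" && $(i + 1) == "at") {
                    name = $(i - 1)                  # e.g. "node[4321]:"
                    sub(/\[[0-9]+\]:$/, "", name)    # strip the "[pid]:" suffix
                    count[name]++
                }
            }
        }
        END {
            for (name in count) print count[name], name
        }
    ' | sort -rn
}
```

It could be fed from the kernel log, for example with `journalctl -k | tally_segfaults` or `sudo dmesg | tally_segfaults`.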


Initially I suspected the RAM to be faulty but memtest86 passes 100%. I also tried taking out RAM sticks one by one (I have 2) but it didn't change anything.

I ruled out SSD problems because I have 2 Ubuntu installations on 2 different drives and they both suffer from this problem.

Then I suspected the CPU, because prime95 was failing (on both Windows and Ubuntu). I contacted Intel's support, and eventually they instructed me to change the SVID setting in the BIOS to "Intel's fail safe". This stopped the prime95 failures, but unfortunately it did not fix the application crashes under Ubuntu.

In the end, Intel's support recommended I contact Asus support for further debugging, and Asus support directed me here.


Here's basic info about my system:

  • Motherboard: Asus ROG STRIX Z790-F GAMING WIFI
  • CPU: Intel i9-13900K
  • RAM: Corsair DOMINATOR® PLATINUM RGB 32GB (2x16GB) DDR5 DRAM 6200MT/s CL36
  • GPU: Asus ROG Strix GeForce RTX™ 4090 24GB GDDR6X
  • PSU: Asus ROG STRIX 1000W Gold Aura Edition

I have the latest BIOS. I have the latest OS updates. I reset the CMOS. Currently my only non-default BIOS settings are:

  • SVID "Intel's fail safe" (otherwise prime95 tests fail)
  • XMP enabled (I tried all options though, disabled too)
  • Fast boot disabled (I tried enabled too)
  • Secure boot disabled (I need this to enable CSM)
  • CSM enabled (I need this enabled, otherwise I get no display during boot)

I'm multi-booting: I have 2 Ubuntu 22.04 installations and Windows 10. Windows 10 is on the same drive as one of the Ubuntus. Both Ubuntu installations are affected.

Things seem to work fine in Windows 10. I only use it for gaming, and some games occasionally crash, but nowhere near as often as applications crash on Ubuntu, so I'd say the problem does not occur there (at least since I changed the SVID setting).

I also tried Fedora Workstation 38 live USB and I get the very same crashes there as well.


Here are the reproduction steps that cause the error 99% of the time for me (I have only ever had this pass twice, both times right after a fresh system reboot):

Prerequisites: git and node.js v18

npm install -g yarn
git clone https://github.com/metabase/metabase.git
cd metabase
yarn
git checkout rfc/sort-imports
./node_modules/.bin/eslint --ext .js,.jsx,.ts,.tsx --rulesdir frontend/lint/eslint-rules --max-warnings 0 --report-unused-disable-directives enterprise/frontend/src frontend/src frontend/test e2e/test
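To put a number on how often this fails, the command can be run in a loop while counting the runs that die with SIGSEGV. This is a minimal sketch, not part of the original steps: the helper name `count_segfaults` is hypothetical, and it assumes the crash kills the process with signal 11, which the shell reports as exit status 128 + 11 = 139.

```shell
# count_segfaults RUNS COMMAND [ARGS...]
# Runs COMMAND the given number of times and prints "<crashes>/<runs>".
# Only exit status 139 (killed by SIGSEGV) is counted as a crash;
# ordinary non-zero exits, such as lint errors, are ignored.
count_segfaults() {
    local runs=$1
    shift
    local crashes=0
    local i=0
    local status
    while [ "$i" -lt "$runs" ]; do
        status=0
        "$@" >/dev/null 2>&1 || status=$?
        if [ "$status" -eq 139 ]; then
            crashes=$((crashes + 1))
        fi
        i=$((i + 1))
    done
    echo "$crashes/$runs"
}
```

For example, `count_segfaults 20 ./node_modules/.bin/eslint --ext .js,.jsx,.ts,.tsx --rulesdir frontend/lint/eslint-rules --max-warnings 0 --report-unused-disable-directives enterprise/frontend/src frontend/src frontend/test e2e/test` would print how many of 20 runs ended in a segfault (the count of 20 is arbitrary). Comparing that figure before and after a BIOS change gives a less anecdotal picture than a single run.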

I have another machine (laptop) with Ubuntu 22.04 where the above reproduction steps never throw any errors.


My guess is that Ubuntu is not fully compatible with some of the hardware, or that some BIOS settings do not play well with Ubuntu.

  • Assuming you mean SIGSEGV, afaik that's a software issue, not a hardware one: it basically means that a program attempts to access a memory location that doesn't belong to it (see, for example, "What causes a SIGSEGV?"). Possibly there is a mismatch between ./node_modules/.bin/eslint and one of the shared libraries that it links against?
    Commented Nov 7, 2023 at 0:44
  • Yes, SIGSEGV - edited, thank you. Is there a way to debug and fix this? I also get the errors on a clean Ubuntu installation with stock Ubuntu software (e.g. nautilus, software updater).
    – Kamil
    Commented Nov 7, 2023 at 6:32
  • I'm running very similar hardware (z790-h & 14900K) and have the same problem (random segfaults when compiling rust code). For me it's OS independent, happens on Ubuntu but also on Windows. My other machine works fine. I've disabled XMP, Asus ME, and reset CMOS many times with no luck. Also removed the GPU, removed all but one RAM stick (swapped them too). For me it's either a faulty CPU or mainboard and I don't know the best way to proceed. Have you found a solution/answer yet?
    – samvdst
    Commented Jan 18 at 20:09
  • @samvdst Yes, it was a faulty CPU after all. The initial conclusion after contacting Intel support was that it's an OS or motherboard problem, but later on I discovered that decreasing the multiplier of all performance cores to 52x makes all of the problems go away. This basically proved it's a CPU problem. Intel replaced my unit with a new one and now everything works.
    – Kamil
    Commented Jan 20 at 7:45
  • Never mind, it just got a little more stable. I still get segfaults about 1 in 20 times. Time to replace the CPU.
    – samvdst
    Commented Jan 24 at 23:31
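The 52x multiplier test mentioned in the comments was done in the BIOS. A rough OS-level approximation on Linux is to cap the cpufreq maximum instead. The sketch below rests on assumptions: a 100 MHz base clock (so 52x is about 5.2 GHz), a cpufreq driver that exposes `scaling_max_freq`, and two hypothetical helper names. A cap set this way is not the same as a BIOS multiplier limit (voltages are untouched), so it is only a quick way to see whether lower clocks reduce the crashes.

```shell
# multiplier_to_khz MULTIPLIER [BCLK_MHZ]
# Converts a core multiplier to a frequency in kHz, the unit used by cpufreq.
# Assumption: the base clock is 100 MHz unless given as the second argument.
multiplier_to_khz() {
    local mult=$1
    local bclk_mhz=${2:-100}
    echo $((mult * bclk_mhz * 1000))
}

# cap_max_freq KHZ
# Writes the cap to every CPU's scaling_max_freq. Must be run as root.
# The setting is not persistent and is lost on reboot.
cap_max_freq() {
    local khz=$1
    local f
    for f in /sys/devices/system/cpu/cpu*/cpufreq/scaling_max_freq; do
        echo "$khz" > "$f"
    done
}
```

As root, `cap_max_freq "$(multiplier_to_khz 52)"` would apply the cap to all cores.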
