Basically the title. I’ve run into an issue where I’m getting memory errors all over the place, even after replacing all of the RAM sticks and adding them back one at a time, testing as I went. This last time there were no memory errors until about 3 hours in, while I was watching something from my Plex server. That was after I had tested and excluded the slots that were failing; I assumed those slots are bad, since other new RAM failed in the same slots.
I can give logs or anything needed, but how exactly do I know if the motherboard itself is on its way out and needs replacing?
To follow that up, what motherboard can I just swap out for this one without having to start from scratch?
I really appreciate the help guys, I’m really at a loss!
If you see errors in those logs, start by actually searching for those errors online. I cannot tell you how many times I’ve been saved by 10 minutes of searching for the one error in my syslog!
If there are no errors, you may need to diagnose the hard way. The way I’d approach it, if I had no other clue, is with what’s called a burn-in test. If it were me, I’d:
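If the log is long, a quick tally of which error lines repeat the most can tell you what to search for first. Here's a rough Python sketch, assuming a plain-text syslog file. The keyword list and the function name `tally_error_lines` are my own picks for illustration and aren't tied to any particular OS or tool, so adjust them to whatever your logs actually contain:

```python
import re
import sys
from collections import Counter

# Assumed keywords (illustrative only); edit to match your own logs.
KEYWORDS = ("error", "fail", "mce", "edac", "ecc")

def tally_error_lines(lines, keywords=KEYWORDS):
    """Count lines containing any keyword, with digit runs masked as 'N'."""
    counts = Counter()
    for line in lines:
        lowered = line.lower()
        if not any(k in lowered for k in keywords):
            continue
        # Mask digit runs so repeats of the same message group together
        # even when timestamps or counters differ.
        signature = re.sub(r"\d+", "N", line.strip())
        counts[signature] += 1
    return counts

# Only runs when a log path is passed, e.g.: python tally.py /path/to/syslog
if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1], errors="replace") as f:
        for signature, n in tally_error_lines(f).most_common(10):
            print(f"{n:6d}  {signature}")
```

The masking is crude (month names and hex addresses can still split one message into several groups), but it's usually enough to spot the one or two messages worth searching for.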
- Disconnect all add-in cards and drives from the board
- Go down to the smallest set of RAM you can boot from
- Save off current BIOS settings
- Turn off everything non-essential in the BIOS
- Download a burn-in tool, set up a flash drive to run it from, and run it
- If you still see the issue, and you have dual CPUs, pull one CPU and re-run
- If you don’t see the issue, start adding components back, in stages
Hopefully the above can help you narrow down which component is actually failing.
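If it helps to keep track, the "add components back in stages" part can be sketched like this in Python. It's only a bookkeeping aid under my own assumptions: the function name `first_failing_stage` and the stage labels are hypothetical, and the burn-in run itself is still something you do by hand and report back as pass or fail.

```python
def first_failing_stage(stages, passes_burn_in):
    """Add hardware back one stage at a time; return the first stage that fails.

    stages: ordered labels for what gets re-added at each step.
    passes_burn_in: callable that takes the list of stages installed so far
        and returns True if the burn-in run came back clean.
    Returns None if every stage passes.
    """
    installed = []
    for stage in stages:
        installed.append(stage)
        if not passes_burn_in(installed):
            return stage
    return None

def ask_user(installed):
    """Prompt for the result of a manual burn-in run with this hardware."""
    answer = input(f"Burn-in clean with {installed} installed? [y/n] ")
    return answer.strip().lower().startswith("y")

# Example (hypothetical stage labels):
# first_failing_stage(["second CPU", "remaining RAM", "add-in cards", "drives"], ask_user)
```

The first stage that fails only tells you the problem showed up once that part went back in. It could be the part itself or whatever it plugs into, so swapping it to another slot or port is a sensible next check.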
To help answer the follow-up – what OS are you running?
I appreciate the detail! I’m running the latest version of Unraid.
The errors I have been getting (and googled) point to RAM errors, but they kept happening even after I replaced all of the RAM, which now points me to the memory slots themselves. But with both of them going bad simultaneously, I was worried it could mean the board itself is bad.
When I get home I’ll upload some of the logs, as I believe another error popped up that may have pointed to the CPU.
If one of the CPUs were going bad, would that cause it to throw memory errors for the RAM?