Famine

I haven’t been able to play games this week, for reasons unrelated to the move.

The issue started when I booted up WoW. I was able to play for about 5-10 minutes – enough time to try purchasing more WoW Tokens – but then the game froze for a moment, recovered, then after another 5 minutes my PC just reset. Immediately afterwards, I heard the fans on my GTX 970 turn on full blast but there was no monitor signal. About a minute later, the PC sounds like it is functioning okay, but again, no monitor signal. I reboot and the PC turns on fine.

“Weird,” I think. Turn on WoW again, and the PC resets again after 5 minutes. Again, the 970 fans kick into high gear. This time though, resetting the PC does not bring the monitor back up.

I moved the 970 card to another slot, used a different DVI port, unplugged the PC from my UPS to plug it directly into the wall outlet, and so on. None of that helps. Plugging the monitor cable directly into the motherboard port, i.e. the integrated graphics, works however. I eventually dig out my old 560ti card, and that also works. Okay, great. Maybe the 970 card died?

I run Furmark to stress the 560ti card to reproduce the error. Temps stabilize around 70°C. I run a CPU stress test, and those temps are within acceptable limits as well. Boot up WoW, play for another 10 minutes, and hard reset yet again. “Okay, not playing WoW, gotcha.” I turn the PC back on, start filling out an RMA request for the 970 card, and my PC hard resets while inside a browser. As of today, the monitor will display nothing no matter what card or slot the cord is plugged into.

Now, the primary culprit is currently suspected to be the PSU. According to a coworker, the 12V “bridge” which powers graphics cards often goes out first. I ended up inadvertently ordering two PSUs – one was 500W, which is not likely enough to power the rig, but Amazon wouldn’t let me cancel the order – and they should be arriving soon. I’ll have to cross my fingers regarding the GTX 970 card still being functional. It is technically within the 2-year warranty/RMA period, but I’m expecting some potential pushback should the card be fried when there were PSU issues afoot.

Of course, it might be something else entirely. Maybe the motherboard.

I don’t consider myself to be particularly tech savvy as much as software savvy. Replacing the PSU is going to be the most complicated PC-related task I have completed since I installed the GTX 970 card in the first place. Prior to that, the most complicated task was installing a new SSD. It is not so much the physical actions that worry me, but rather the possibility (and repercussions) of failure. “Oops, I bricked the $300 video card.”

In any case, we’ll see how it goes. I dive into the case Thursday night.

Edit: I spent a little over 2 hours last night replacing the PSU. GTX 970 card still is not working. Moved the entire computer to a different outlet, same deal. Swapped in the 560ti card and everything booted up fine. Then I went for the test: logging into WoW. About 10 minutes in, power shutdown. Plugged monitor into integrated video slot, left the machine idle for a bit. Power cut off again.

At this point, I am at a loss. The rig is old – I bought it around 5 years ago or so – but I no longer have any idea what could be wrong. I’m going to try and take it to a Microcenter near me for diagnosis this weekend. Depending on the verdict… I dunno. Maybe I “cut” losses and go back to a gaming laptop, like the one I had prior to this computer. Maybe I try to save what I got.

All I can say for the moment is that I feel lost. Everyone has a thing that keeps them centered when the rest of life gets weird. Sometimes it’s a person, sometimes it’s a game, sometimes it’s a phone. Mine is a functioning computer. Or perhaps a personalized virtual space, if I want to be more specific. I don’t have that for the moment, and it sucks.

Posted on February 10, 2017, in Miscellany. 5 Comments.

  1. It’s VERY hard to diagnose at distance and with indirect information.

    My thought process (similar to yours):
    – PC crashes in games => video card overheating => check video card fans
    – replace video card and test again (or use on-board video output), no crash = the original video card has temperature issues
    – since it ALSO crashes with another video card and it crashes with non-graphics intensive applications => system overheating, either the CPU or the chipset, but it’s strange that you get signal with one video card and not with the other: if it’s the motherboard it should crash 5 mins later whatever the video card.
    – PSU being overloaded by the initial video card doesn’t hold either, browsing is not particularly intensive, it should crash if you do 3D, not if you idle.

    I cannot figure a scenario which matches the data:
    – computer crashing late/under load usually points to overheating of some component. But this should not depend much on which video card you use, and it should be proportional to how much you load the system. Power up, intensive use => crash fast. Power up, low use => crash later. Your scenario doesn’t seem to match this.
    But there’s one thing you don’t mention and which is important for heat-related issues: how long do you wait when replacing cards? At the same time you don’t seem very confident about switching parts in and out, so I guess you do it slowly, giving the system plenty of time to cool down whatever you do.
    You can do this test: boot up the PC, go into BIOS and DO NOTHING. Even better, open the BIOS page which shows the voltages and temperatures. Wait and see if it crashes (or if some values become weird). If it doesn’t, that more or less rules out overheating. Now the problem is that, again, which video card you put in should not matter. Video cards do little during boot up except drain power, but replacing the PSU rules that out. What happens if you put in the 970 while the onboard display is enabled and you use it? Still dead?
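
    The idle test can’t be scripted from inside the BIOS, but a rough OS-side analogue is to log temperatures to disk while the machine sits idle, so the last reading before a hard reset survives. A minimal Python sketch, assuming a hypothetical `read_temp` function you supply yourself (how to read a sensor depends on the OS and hardware; on Linux, `psutil.sensors_temperatures()` is one option). The function names and CSV layout here are made up for illustration:

```python
import csv
import os
import time
from typing import Callable, Optional


def log_temperatures(path: str, read_temp: Callable[[], float],
                     samples: int, interval_s: float = 1.0) -> None:
    """Append (unix_time, temp_c) rows to a CSV, forcing each row to disk.

    read_temp is a hypothetical callback: plug in whatever sensor
    reading your OS and hardware actually expose.
    """
    with open(path, "a", newline="") as f:
        writer = csv.writer(f)
        for _ in range(samples):
            writer.writerow([f"{time.time():.1f}", f"{read_temp():.1f}"])
            f.flush()
            os.fsync(f.fileno())  # ask the OS to write the row out now
            time.sleep(interval_s)


def last_reading(path: str) -> Optional[float]:
    """Return the last parseable temperature in the log, or None."""
    last = None
    with open(path, newline="") as f:
        for row in csv.reader(f):
            if len(row) == 2:
                try:
                    last = float(row[1])
                except ValueError:
                    pass  # e.g. a row cut off before the temperature
    return last
```

    If the last logged temperature is unremarkable when the power cuts out, that points away from overheating, at least for whichever sensor was being read.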

    Of course the problem is that if there are TWO causes it becomes a mess, and this is not as impossible as it may seem…. what I mean is that maybe something went wrong and killed the 970 / damaged the motherboard, so now any diagnosis you attempt with the video card is irrelevant, because it is caused by one problem and so it completely hides the other (and makes guessing harder….).

    Other random attempts:
    – put the 970 in and DON’T connect the extra power supply cord to the video card. The driver detects this and runs the card in low-end mode (= you cannot play), but booting should be possible.
    – I have no idea how the card is powered when you start up; if something went wrong with the PCI-E bus and it cannot supply a card drawing more power, that could explain why it boots with an older card but not with a newer one.

    Ideally (but it’s too late), when this kind of stuff starts happening, the first thing you do is disconnect any disk which has important data. Damaged PSU/mobo/etc. can propagate the damage, and parts can be replaced, unsaved data can’t…..

    Hmmm… wall of text which I fear won’t be very useful…..

    • I’ve been talking with a coworker some more, and our thinking is more towards my (closed) liquid cooling system for the CPU at this point. Specifically, I haven’t done any maintenance to it beyond dusting since I bought the PC in 2011; didn’t even know I had to, since it’s a closed loop. So perhaps there is air in there somehow (certainly no leaks I can see), and the CPU is overheating. Makes sense that WoW causes it to crash sooner since it’s a processor hog, although I can’t really explain the Furmark CPU tests running fine for 5+ minutes. Coworker thinks that the CPU temp readings might be off if the gauges are registering the motherboard temps rather than CPU directly.

      If the CPU is overheating and shutting the system down, frying the 970 card by proxy also makes sense, as the 970 fans wouldn’t have been able to cool things down post-shutdown. Then again, could be bad RAM, fried motherboard, etc etc etc.

      At this point though, I am kinda done self-diagnosing slash buying components blindly. Microcenter will charge $40 to look at it, then X amount to repair depending, and the whole process is likely to take 3 days. Could I try and do these things myself? Sure. I could also have started the clock 3 days ago and had something to play over the weekend by now.

      Thanks for the advice though. I should have posted something earlier in the week when this all started, and avoided the PSU purchase potentially.

      • RAM is about the easiest thing in the world to test, and frankly is always the first thing I look after when these sorts of issues arise.

        Get yourself a bootable disc/usb with good old MEMTEST86 on it and let it run a few passes before you bother with a shop.

      • Ah you didn’t mention the liquid cooling…. since you talk about GPU fans, it’s CPU-chipset only I guess. And yes, they need maintenance, not just to make sure that there are no air bubbles, but even with additives you get algae growing in there, so periodically you need to purge/clean/replace the liquid. It’s a good explanation of the symptoms, except for the 970.

        I don’t believe the CPU shutdown killed the graphics card. The fans at full power are usually the result of the fact that at boot time, before any decision can be made, the cooling systems are set by default to the most paranoid mode, so any heating, even minimal, puts the system at full power. My mobo does the same with the CPU fan at boot time. Then, as soon as the BIOS starts up and configures things correctly, the cooling stops overreacting and the fan speed drops to normal.

        To help in diagnosing this stuff, a good thing is to have a second PC you can use to switch parts in and out. Even if it’s potentially dangerous, it allows you to rule out problems with specific components. I now always keep the “old PC” around, to use as an emergency system or for web navigation for guests.

  2. I’d look at it as: you got a good run out of a PC for 5 years, it was due for an upgrade, so upgrade and get a new one.

    With liquid cooling that has never been maintained, you are pretty lucky to get 5 years out of it IMO.
