Relax... everything is fine.

For current or would-be users of Apple Mac computers, with answers to many FAQs.

Re: Relax... everything is fine.

Postby merlyn » Tue Apr 13, 2021 12:50 am

No, don't worry.

GPU Die -- Analog and GPU Vcore can't both be right. One of them is spurious. If the voltage is 0 the GPU isn't doing anything.

The happier place to be is to believe GPU Vcore and ignore the temperature.
merlyn
Regular
Posts: 312
Joined: Thu Nov 07, 2019 3:15 am
It ain't what you don't know. It's what you know that ain't so.

Re: Relax... everything is fine.

Postby desmond » Tue Apr 13, 2021 11:19 am

merlyn wrote:GPU Die -- Analog and GPU Vcore can't both be right. One of them is spurious. If the voltage is 0 the GPU isn't doing anything.

Yep, the GPU is intentionally disabled, hence the 0V.

merlyn wrote:The happier place to be is to believe GPU Vcore and ignore the temperature.

True, but the sensor *is* reading. For instance, this morning the temp was running at around 80 degrees on casual use. I then ran Logic (at the beginning of the graph below) to test some plugins, and as the CPU worked it generated heat (rising from about 55 to about 85 degrees, which is pretty normal). The GPU temp sensor again climbed and flatlined at 128 until I cranked up the fans and finished what I was doing, after which the temps dropped again. So it's not as if it isn't reading temperature changes as such.

Image

Without knowing how these sensors work and where they are placed - presumably they can't be on the GPU die because without power they wouldn't work at all - it's difficult to know what's going on and how to read whether it's a real problem or not, which is a bit frustrating...
desmond
Jedi Poster
Posts: 11435
Joined: Tue Jan 10, 2006 1:00 am
mu:zines | music magazine archive | difficultAudio

Re: Relax... everything is fine.

Postby Folderol » Tue Apr 13, 2021 11:49 am

They could indeed be on the die, and very likely are. The simplest, and fairly linear, type is a forward-biased diode, typically -2mV/degC (depending on current).
On the die, this can be placed right at the hottest part of the chip.
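
As a rough illustration of how a diode reading turns into a temperature, here is a minimal sketch. It assumes a purely linear response with the -2mV/degC slope mentioned above; the 600 mV at 25 degC calibration point and the function name are made up for the example and are not taken from any real chip.

```python
# Hypothetical calibration values, for illustration only.
V_REF_MV = 600.0        # assumed diode voltage at the reference temperature
T_REF_C = 25.0          # assumed reference temperature in degC
SLOPE_MV_PER_C = -2.0   # the typical slope quoted in the post


def diode_temp_c(v_diode_mv: float) -> float:
    """Linear estimate of temperature from the measured diode voltage."""
    return T_REF_C + (v_diode_mv - V_REF_MV) / SLOPE_MV_PER_C


# A lower diode voltage means a higher temperature.
print(diode_temp_c(600.0))  # 25.0
print(diode_temp_c(480.0))  # 85.0
```

With a slope this small, a few millivolts of error in the measurement shifts the reading by a degree or more, so the measuring circuit matters as much as the diode itself.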
Folderol
Jedi Poster
Posts: 13062
Joined: Sat Nov 15, 2008 1:00 am
Location: The Mudway Towns, UK
Yes. I am that Linux nut.
Onwards and... err... sideways!

Re: Relax... everything is fine.

Postby Wonks » Tue Apr 13, 2021 11:50 am

They will undoubtedly be NTC thermistors, which are a very cheap solid-state temp sensor. Their resistance goes down as the temperature increases. So a higher indicated temperature means either a lower measured resistance (maybe a partial short circuit), or that the reference voltage put across them is a bit out of whack. E.g. if the circuit was supposed to have 3.3 V across it but instead has 3.1 V, then you can get significantly different temperatures reported by the software.
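
The reference-voltage effect can be sketched roughly as follows. This is only an illustration under stated assumptions: a 10 kOhm NTC (Beta = 3950) in a simple divider with a 10 kOhm fixed resistor, an ADC that measures absolute voltage, and software that always assumes a 3.3 V supply. None of these values or names come from the Mac in question.

```python
import math

# All values below are hypothetical, chosen only for illustration.
R0_OHM = 10_000.0      # assumed thermistor resistance at 25 degC
T0_K = 298.15          # 25 degC in kelvin
BETA = 3950.0          # assumed Beta constant of the thermistor
R_FIXED_OHM = 10_000.0 # assumed fixed resistor between supply and thermistor
V_ASSUMED = 3.3        # supply voltage the software believes it has


def ntc_resistance(temp_c: float) -> float:
    """Beta-model resistance of the thermistor at a given temperature."""
    t_k = temp_c + 273.15
    return R0_OHM * math.exp(BETA * (1.0 / t_k - 1.0 / T0_K))


def temp_from_resistance(r_ohm: float) -> float:
    """Invert the Beta model: resistance back to degC."""
    inv_t = 1.0 / T0_K + math.log(r_ohm / R0_OHM) / BETA
    return 1.0 / inv_t - 273.15


def reported_temp_c(true_temp_c: float, actual_supply_v: float) -> float:
    """Temperature the software reports when the real supply differs."""
    r_true = ntc_resistance(true_temp_c)
    # Voltage actually seen across the thermistor (bottom leg of the divider).
    v_meas = actual_supply_v * r_true / (r_true + R_FIXED_OHM)
    # The software back-calculates resistance using the assumed supply.
    r_est = R_FIXED_OHM * v_meas / (V_ASSUMED - v_meas)
    return temp_from_resistance(r_est)


print(reported_temp_c(80.0, 3.3))  # matches the true temperature
print(reported_temp_c(80.0, 3.1))  # reads higher than the true temperature
```

With the supply sagging, the software underestimates the resistance and so reports a higher temperature than the real one. How much higher depends on the divider values and the thermistor curve, and a ratiometric ADC design would largely cancel the effect.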

I found this Intel paper on on-chip temperature monitoring, which may provide some more background information for you.

https://www.intel.com/content/dam/www/p ... -paper.pdf
Wonks
Jedi Poster
Posts: 11630
Joined: Thu May 29, 2003 12:00 am
Location: Reading, UK
Correcting mistakes on the internet since 1853

Re: Relax... everything is fine.

Postby desmond » Tue Apr 13, 2021 11:56 am

Useful, thanks! :thumbup:

Re: Relax... everything is fine.

Postby Folderol » Tue Apr 13, 2021 11:56 am

Interesting. That means Intel are way behind the curve. I'm pretty certain the ARM chips use diodes - don't know about AMD.

Re: Relax... everything is fine.

Postby desmond » Tue Apr 13, 2021 12:09 pm

I tried a different temperature reporting tool.

This one reports "N/A" once the temperature reaches 110, but otherwise agrees with the temperature readings from iStatPro.

Image

I would have thought the temp sensor can't be on die if the GPU is powered down, but the sensor is still working?

I guess the better explanation is likely to be as Wonks suggests: some recent-ish change in resistance (or similar) around the faulty GPU area is throwing the temperature reading off, even though the sensor is responding to temperature changes...

Re: Relax... everything is fine.

Postby desmond » Tue Apr 13, 2021 3:50 pm

Folderol wrote:They could indeed be on the die, and very likely are.

Actually, looking at how the two GPU sensors are labelled, they are called:

GPU Die - Analog (or "GPU Diode" - this is the problem sensor)
GPU Die - Digital (this is non-functional and gives no reading, presumably because the GPU is disabled)

So maybe the "digital" sensor is the on-die sensor, and the "analog" problem one is a separate thermistor associated with the GPU, and misbehaving as per Wonks' suggestion.

Re: Relax...

Postby tea for two » Thu Apr 22, 2021 6:31 am

desmond wrote:Image

It's ok, it's probably not at 128 degrees.

It's probably much higher, it's just the sensor maxes out at 128...

Hurry up and release some new M1 MBPs, Apple, before my machine melts through the desk... :shocked: :?

Clearly this gpu got all hot n flustered watching Frankie Goes to Hollywood.
tea for two
Frequent Poster
Posts: 876
Joined: Sun Mar 24, 2002 1:00 am
