Fanshim automatic.py preempt bug

I think there’s a bug in autmatic.py

My output>

sudo python3 automatic.py --threshold 60 --hysteresis 10 --delay 1 --preempt --verbose
Current: 38.46 Target: 60.00 Freq 1.50 MaxFreq 1.50 Automatic: True On: False
Current: 38.46 Target: 60.00 Freq 0.60 MaxFreq 1.50 Automatic: True On: True
Current: 38.95 Target: 60.00 Freq 0.60 MaxFreq 1.50 Automatic: True On: True
Current: 37.97 Target: 60.00 Freq 0.60 MaxFreq 1.50 Automatic: True On: True
Current: 38.95 Target: 60.00 Freq 0.60 MaxFreq 1.50 Automatic: True On: True
Current: 37.97 Target: 60.00 Freq 0.60 MaxFreq 1.50 Automatic: True On: True
Current: 39.43 Target: 60.00 Freq 0.60 MaxFreq 1.50 Automatic: True On: True
Current: 37.97 Target: 60.00 Freq 0.60 MaxFreq 1.50 Automatic: True On: True
Current: 38.46 Target: 60.00 Freq 0.60 MaxFreq 1.50 Automatic: True On: True
Current: 38.46 Target: 60.00 Freq 0.60 MaxFreq 1.50 Automatic: True On: True
Current: 38.46 Target: 60.00 Freq 0.60 MaxFreq 1.50 Automatic: True On: True
Current: 38.46 Target: 60.00 Freq 0.60 MaxFreq 1.50 Automatic: True On: True

From code

try:
update_led(fanshim.get_fan())
while True:
    t = get_cpu_temp()
    f = get_cpu_freq()
    if args.verbose:
        print("Current: {:05.02f} Target: {:05.02f} Freq {: 5.02f} MaxFreq {: 5.02f} Automatic: {} On: {}".format(t, args.threshold, f.current / 1000.0, f.max / 1000.0, armed, enabled))
    if abs(last_change - t) > args.hysteresis and armed:
        enable = (t >= args.threshold)
        if args.preempt:
            enable = enable or (int(f.current) == int(f.max))
        if set_fan(enable):
            last_change = t
    time.sleep(args.delay)

(this is your code, with one extra parameter for max frequency)

if abs(last_change - t) > args.hysteresis and armed:
must be true because enabled is evaluated so I presume
last_change is presumably 0, t is around 38, 0-38 is -38. Abs(-38) = 38. 38>10

Then
enable = (t >= args.threshold)
t=38, 38 is less than theshold (60), so enable is false.

preempt is true, so we move to this test
enable = enable or (int(f.current) == int(f.max))
enable is initially false on first pass. However initially there’s a processor surge when initially launching the program, so current frequency initially maxes max frequency.
enabled is turned on.
The fan switches on
last_change is set to current temp, around 38.

if abs(last_change - t) > args.hysteresis and armed:
will be never be true as the last temp and temp are around 38 and the difference will never exceed my hysteresis (10) so enabled will never re-evalute, so enabled will remain on for ever.

There is also a possibility that with a large hysteresis and a short interval, that the current code would never have a large enough temperature change to trigger a reevaluation of enabled.

I’ve taken the liberty to have a go at rewriting.
If preempt and running full speed, fan on.
If over temperature, fan on.
If under (temperature - hysteresis), fan off

I think you’re right, but I’m not sure I’ve entirely got my head 'round how everything should interact yet. I probably made a mistake when I chose a fancy concept like “hysteresis” and combined it with low-frequency samples of the data.

Your code looks pretty good and I think it’s a step in the right direction, but I can’t help but think CPU Frequency and Temperature should both be averaged over a number of samples to prevent intermittent peaks (like the CPU usage at startup) from triggering a fan state change.

It’s amazing how complex fan control can get if you try to get clever about it! Thanks for weighing in.

https://pastebin.com/GZ4Lv6Eg

I agree about averaging processor speed, disagree about averaging temp, because temperature changes slowly.
Revised code only preempts if CPU was at max speed last pass and this pass. I doesn’t guarantee that it was fast for the whole duration of the update interval, but it does cut out transient blips.

Sincerely hope this helps. Not doing this to criticise. I’ve wanted a temperature sensitive fan for ages, and your product & your code that I’ve modded as above has given me what I want.

Many thanks! :-)

1 Like

I wouldn’t mind if you were. My code doesn’t get nearly enough criticism and I really enjoy having extra input into it.

Hopefully I’ll find some time to give this a proper go today.

Are you comfortable using GitHub at all? It might be worth you “forking” the Fan SHIM library, making these changes and raising them as a “Pull Request” so that the changes are duly credited. I don’t want to lift your work out of a pastebin and make it look like mine :D

I’ve raised a draft pull request with a minor variation on your changes- https://github.com/pimoroni/fanshim-python/pull/7/files

I’m checking hysteresis against abs(last_change - t) since this prevents the fan from switching on/off to quickly during both a rising or falling temperature gradient.

Hi,

I’m not-at-all comfortable with github as I’ve hardly used it and don’t currently have time to learn it. Don’t mind if you take the credit.

Re: “checking hysteresis against abs(last_change - t) since this prevents the fan from switching on/off to quickly during both a rising or falling temperature gradient.”

My code is deterministic. Ignoring the “Preempt” code, if temp > threshold, the fan is on.
If temp < (threshold - hysteresis), fan is off.

With your suggested test, you go back to the problem I initially had.
Look at my settings: -threshold 60 --hysteresis 10 --delay 1
If the fan was on, looking at the temperature changes between each reading in my initial output, no change is >2, certainly not >10, so
elif armed and abs(last_change - t) > args.hysteresis:
would never be true and

    if t >= args.threshold:
        enable = True
    elif t < args.threshold:
        enable = False

would never execute. It relies on a large enough change happening to trigger the absolute check against the threshold temperature.

I can see what you’re trying to do, and at first glance it looks like it ought to work, but I don’t think it will.

I can understand and agree with not wanting to turn the fan on and off too frequently. To a certain extent, this is already countered by the delay parameter. You could ameliorate this by keeping an average of the last few temperature readings, and comparing that average with the threshold - however, I think the instantaneous temperature is more important, because if the current temperature is too high, you need to turn the fan on now, not in a while when the average catches up.

You could add another timer, to enforce not changing fan state until a time limit has expired, but that’s basically what delay does anyway.

What do you think?

Note to self: Must learn github!

Above all else, this struck me as a really important point and there’s no way I can reasonably disagree.

I now see where you’re coming from, and suggest we should probably just drop hysteresis altogether and replace it with a minimum temperature setting- which is, I think, effectively what you were already doing save for renaming it. Dropping the less-known buzzword would make it easier to understand, too.

So, yes, TLDR: You’re right!

I think most of the fan controller kernel drivers use a “min” and “max” temperature and scale the fan speed between those two, also.