Noisy Fans on Edgecore

  • Updated 1 week ago

Any idea how I can turn down the fans on this switch?



root@cumulus:~# net show ver
NCLU_VERSION=1.0
DISTRIB_ID="Cumulus Linux"
DISTRIB_RELEASE=3.6.0
DISTRIB_DESCRIPTION="Cumulus Linux 3.6.0"
root@cumulus:~# net show sys
Accton AS7712
Cumulus Linux 3.6.0
Build: Cumulus Linux 3.6.0

Chipset: Broadcom Tomahawk BCM56960
Port Config: 32 x 100G-QSFP28
CPU: (x86_64) Intel Atom C2558 2.40GHz
Uptime: 0:51:43.080000

root@cumulus:~#
usage: smonctl [-h] [-j] [-s SENSOR] [-v]
smonctl: error: argument -s/--sensor: expected one argument
root@cumulus:~# smonctl
Fan1      (Fan Tray 1, Fan A                     ):  OK
Fan2      (Fan Tray 1, Fan B                     ):  OK
Fan3      (Fan Tray 2, Fan A                     ):  OK
Fan4      (Fan Tray 2, Fan B                     ):  OK
Fan5      (Fan Tray 3, Fan A                     ):  OK
Fan6      (Fan Tray 3, Fan B                     ):  OK
Fan7      (Fan Tray 4, Fan A                     ):  OK
Fan8      (Fan Tray 4, Fan B                     ):  OK
Fan9      (Fan Tray 5, Fan A                     ):  OK
Fan10     (Fan Tray 5, Fan B                     ):  OK
Fan11     (Fan Tray 6, Fan A                     ):  OK
Fan12     (Fan Tray 6, Fan B                     ):  OK

Rory Browne

Posted 2 weeks ago

Scott Emery, Official Rep

Hello Rory,

In general, we don't allow you to turn down the fans on a switch: they run at the speed needed to keep the switch cool and prevent permanent heat-related damage. You can imagine the havoc if customers who wanted a silent box just turned the fans off.

You can see the fan speeds by including the "-v" option on smonctl. Here are the fan speeds on a system in our lab. Of course, your speeds will be different depending on the temperature readings on your system, but this should give you an idea of what "normal" fan speeds are:


$ smonctl -v
Fan1(Fan Tray 1, Fan A):  OK
fan:14900 RPM   (max = 21300 RPM, min = 12000 RPM, limit_variance = 15%)

Fan2(Fan Tray 1, Fan B):  OK
fan:12500 RPM   (max = 17700 RPM, min = 10100 RPM, limit_variance = 15%)

Fan3(Fan Tray 2, Fan A):  OK
fan:14900 RPM   (max = 21300 RPM, min = 12000 RPM, limit_variance = 15%)

Fan4(Fan Tray 2, Fan B):  OK
fan:12300 RPM   (max = 17700 RPM, min = 10100 RPM, limit_variance = 15%)

Fan5(Fan Tray 3, Fan A):  OK
fan:14800 RPM   (max = 21300 RPM, min = 12000 RPM, limit_variance = 15%)

Fan6(Fan Tray 3, Fan B):  OK
fan:12400 RPM   (max = 17700 RPM, min = 10100 RPM, limit_variance = 15%)

Fan7(Fan Tray 4, Fan A):  OK
fan:14700 RPM   (max = 21300 RPM, min = 12000 RPM, limit_variance = 15%)

Fan8(Fan Tray 4, Fan B):  OK
fan:12300 RPM   (max = 17700 RPM, min = 10100 RPM, limit_variance = 15%)

Fan9(Fan Tray 5, Fan A):  OK
fan:14800 RPM   (max = 21300 RPM, min = 12000 RPM, limit_variance = 15%)

Fan10(Fan Tray 5, Fan B):  OK
fan:12400 RPM   (max = 17700 RPM, min = 10100 RPM, limit_variance = 15%)

Fan11(Fan Tray 6, Fan A):  OK
fan:14800 RPM   (max = 21300 RPM, min = 12000 RPM, limit_variance = 15%)

Fan12(Fan Tray 6, Fan B):  OK
fan:12400 RPM   (max = 17700 RPM, min = 10100 RPM, limit_variance = 15%)

PSU1:  OK

PSU2:  BAD

PSU1Temp1(PSU1 Inlet Temp Sensor):  OK
temp:26.0 C (lcrit = 0 C, fan_max = 50 C, fan_min = 25 C, min = 5 C, max = 50 C, crit = 60 C)

PSU1Temp2(PSU1 Max Temp Sensor):  OK
temp:34.0 C (lcrit = 0 C, fan_max = 50 C, fan_min = 25 C, min = 5 C, max = 50 C, crit = 60 C)

PSU2Temp1(PSU2 Inlet Temp Sensor):  ABSENT

PSU2Temp2(PSU2 Max Temp Sensor):  ABSENT

Temp1(Temp sensor behind networking asic):  OK
temp:28.0 C (lcrit = 0 C, fan_max = 55 C, fan_min = 41 C, min = 5 C, max = 61 C, crit = 67 C)

Temp2(Temp sensor in front of networking asic):  OK
temp:26.5 C (lcrit = 0 C, fan_max = 60 C, fan_min = 46 C, min = 5 C, max = 66 C, crit = 69 C)

Temp3(Temp sensor front left):  OK
temp:25.5 C (lcrit = 0 C, fan_max = 50 C, fan_min = 32 C, min = 5 C, max = 56 C, crit = 59 C)

Temp4(Temp Sensor Near CPU):  OK
temp:23.5 C (lcrit = 0 C, fan_max = 50 C, fan_min = 33 C, min = 5 C, max = 57 C, crit = 60 C)

Temp5(Intel CPU die sensor):  OK
temp:22.0 C (lcrit = 0 C, fan_max = 65 C, fan_min = 58 C, min = 5 C, max = 70 C, crit = 98 C)

Temp6(Intel CPU die sensor):  OK
temp:22.0 C (lcrit = 0 C, fan_max = 65 C, fan_min = 58 C, min = 5 C, max = 70 C, crit = 98 C)

Temp7(Intel CPU die sensor):  OK
temp:23.0 C (lcrit = 0 C, fan_max = 65 C, fan_min = 58 C, min = 5 C, max = 70 C, crit = 98 C)

Temp8(Intel CPU die sensor):  OK
temp:23.0 C (lcrit = 0 C, fan_max = 65 C, fan_min = 58 C, min = 5 C, max = 70 C, crit = 98 C)

Temp9(Networking ASIC Die Temp Sensor):  OK
temp:38.9 C (lcrit = 0 C, fan_max = 82 C, fan_min = 67 C, min = 5 C, max = 87 C, crit = 95 C)

Temp10(Networking ASIC Die Temp Sensor):  OK
temp:38.9 C (lcrit = 0 C, fan_max = 82 C, fan_min = 67 C, min = 5 C, max = 87 C, crit = 95 C)

Temp11(Networking ASIC Die Temp Sensor):  OK
temp:39.9 C (lcrit = 0 C, fan_max = 82 C, fan_min = 67 C, min = 5 C, max = 87 C, crit = 95 C)

Temp12(Networking ASIC Die Temp Sensor):  OK
temp:38.4 C (lcrit = 0 C, fan_max = 82 C, fan_min = 67 C, min = 5 C, max = 87 C, crit = 95 C)

Temp13(Networking ASIC Die Temp Sensor):  OK
temp:40.4 C (lcrit = 0 C, fan_max = 82 C, fan_min = 67 C, min = 5 C, max = 87 C, crit = 95 C)

Temp14(Networking ASIC Die Temp Sensor):  OK
temp:42.8 C (lcrit = 0 C, fan_max = 82 C, fan_min = 67 C, min = 5 C, max = 87 C, crit = 95 C)

Temp15(Networking ASIC Die Temp Sensor):  OK
temp:39.4 C (lcrit = 0 C, fan_max = 82 C, fan_min = 67 C, min = 5 C, max = 87 C, crit = 95 C)

Temp16(Networking ASIC Die Temp Sensor):  OK
temp:38.9 C (lcrit = 0 C, fan_max = 82 C, fan_min = 67 C, min = 5 C, max = 87 C, crit = 95 C)

Messages:
PSU2:  status is installed, power_bad

Note that PSU2 is not plugged in on this system, which is why you see power_bad for that supply.
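
To put fan readings like these in context, here is a minimal Python sketch that expresses a fan's current speed as a percentage of its min-to-max range. The regular expression is an assumption based only on the `smonctl -v` line format shown in this thread, and `fan_load` is a hypothetical helper name, not part of Cumulus Linux.

```python
import re

# Pattern for the detail line printed under each fan by `smonctl -v`.
# Assumption: the format matches the output pasted in this thread.
FAN_RE = re.compile(r"fan:\s*(\d+) RPM\s+\(max = (\d+) RPM, min = (\d+) RPM")


def fan_load(line):
    """Return (rpm, percent of the min..max range), or None if no match."""
    m = FAN_RE.search(line)
    if m is None:
        return None
    rpm, rpm_max, rpm_min = (int(g) for g in m.groups())
    percent = 100.0 * (rpm - rpm_min) / (rpm_max - rpm_min)
    return rpm, round(percent, 1)


sample = "fan:14900 RPM   (max = 21300 RPM, min = 12000 RPM, limit_variance = 15%)"
print(fan_load(sample))  # (14900, 31.2)
```

For the Fan1 line from the lab system, this puts the fan roughly a third of the way up its range, so it is running well below its maximum.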

Scott

Eric Pulvino, Official Rep

Modifying fan speeds is not recommended and definitely not supported. Fan speeds are usually set based on the optimal temperature values specified by the hardware manufacturer, so if the fans are spinning quickly, it is because they should be.

Is your environment very warm?
What does output from 'smonctl -v' show?
Is there a particular sensor that is reading significantly higher than others?
Is this happening on all switches of this model number?

Happy to work through this with you because there are options here.
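
One way to check whether a particular sensor stands out is to compare each reading with its own `fan_min` value. The sketch below is a rough illustration in Python. It assumes the `smonctl -v` text format shown earlier in this thread, and it assumes `fan_min` is the temperature at which the fans begin to ramp up, which the thread does not confirm. `headroom` is a hypothetical helper name.

```python
import re

# Assumption: header and detail lines look like the `smonctl -v` output
# pasted in this thread.
NAME_RE = re.compile(r"^(\w+)\(.*\):\s+OK$")
TEMP_RE = re.compile(r"temp:([\d.]+) C .*fan_max = (\d+) C, fan_min = (\d+) C")


def headroom(text):
    """Return [(sensor, degrees C below fan_min)], smallest headroom first."""
    rows, name = [], None
    for raw in text.splitlines():
        line = raw.strip()
        header = NAME_RE.match(line)
        if header:
            name = header.group(1)
            continue
        detail = TEMP_RE.search(line)
        if detail and name:
            temp = float(detail.group(1))
            fan_min = int(detail.group(3))
            rows.append((name, round(fan_min - temp, 1)))
            name = None
    return sorted(rows, key=lambda row: row[1])


# Three sensors copied from the lab output earlier in the thread.
sample = """
PSU1Temp2(PSU1 Max Temp Sensor):  OK
temp:34.0 C (lcrit = 0 C, fan_max = 50 C, fan_min = 25 C, min = 5 C, max = 50 C, crit = 60 C)

Temp1(Temp sensor behind networking asic):  OK
temp:28.0 C (lcrit = 0 C, fan_max = 55 C, fan_min = 41 C, min = 5 C, max = 61 C, crit = 67 C)

Temp9(Networking ASIC Die Temp Sensor):  OK
temp:38.9 C (lcrit = 0 C, fan_max = 82 C, fan_min = 67 C, min = 5 C, max = 87 C, crit = 95 C)
"""

for sensor, margin in headroom(sample):
    print(f"{sensor}: {margin:+.1f} C relative to fan_min")
```

A negative margin means the reading is already above that sensor's `fan_min`. Whether that alone explains the fan speed depends on how the platform's fan control uses these thresholds.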

Rory Browne

Thanks Eric. The environment is at normal lab temperature. Here is the output. I realise that playing with fan configs is not normal, but next to the Edgecore I have an Arista 7060, a very similar switch in terms of hardware, and it's silent. I need to take the Edgecore rack to a customer, and I need to shout in order to be heard next to it.

root@cumulus:~# smonctl -v
Fan1(Fan Tray 1, Fan A):  OK
fan:15000 RPM   (max = 21300 RPM, min = 12000 RPM, limit_variance = 15%)

Fan2(Fan Tray 1, Fan B):  OK
fan:13800 RPM   (max = 17700 RPM, min = 10100 RPM, limit_variance = 15%)

Fan3(Fan Tray 2, Fan A):  OK
fan:15200 RPM   (max = 21300 RPM, min = 12000 RPM, limit_variance = 15%)

Fan4(Fan Tray 2, Fan B):  OK
fan:13800 RPM   (max = 17700 RPM, min = 10100 RPM, limit_variance = 15%)

Fan5(Fan Tray 3, Fan A):  OK
fan:14700 RPM   (max = 21300 RPM, min = 12000 RPM, limit_variance = 15%)

Fan6(Fan Tray 3, Fan B):  OK
fan:13400 RPM   (max = 17700 RPM, min = 10100 RPM, limit_variance = 15%)

Fan7(Fan Tray 4, Fan A):  OK
fan:15100 RPM   (max = 21300 RPM, min = 12000 RPM, limit_variance = 15%)

Fan8(Fan Tray 4, Fan B):  OK
fan:13800 RPM   (max = 17700 RPM, min = 10100 RPM, limit_variance = 15%)

Fan9(Fan Tray 5, Fan A):  OK
fan:14600 RPM   (max = 21300 RPM, min = 12000 RPM, limit_variance = 15%)

Fan10(Fan Tray 5, Fan B):  OK
fan:13500 RPM   (max = 17700 RPM, min = 10100 RPM, limit_variance = 15%)

Fan11(Fan Tray 6, Fan A):  OK
fan:15100 RPM   (max = 21300 RPM, min = 12000 RPM, limit_variance = 15%)

Fan12(Fan Tray 6, Fan B):  OK
fan:13800 RPM   (max = 17700 RPM, min = 10100 RPM, limit_variance = 15%)

PSU1:  BAD

PSU2:  OK

PSU1Temp1(PSU1 Inlet Temp Sensor):  ABSENT

PSU1Temp2(PSU1 Max Temp Sensor):  ABSENT

PSU2Temp1(PSU2 Inlet Temp Sensor):  OK
temp:22.0 C (lcrit = 0 C, fan_max = 50 C, fan_min = 25 C, min = 5 C, max = 50 C, crit = 60 C)

PSU2Temp2(PSU2 Max Temp Sensor):  OK
temp:32.0 C (lcrit = 0 C, fan_max = 50 C, fan_min = 25 C, min = 5 C, max = 50 C, crit = 60 C)

Temp1(Temp sensor behind networking asic):  OK
temp:26.5 C (lcrit = 0 C, fan_max = 55 C, fan_min = 41 C, min = 5 C, max = 61 C, crit = 67 C)

Temp2(Temp sensor in front of networking asic):  OK
temp:25.5 C (lcrit = 0 C, fan_max = 60 C, fan_min = 46 C, min = 5 C, max = 66 C, crit = 69 C)

Temp3(Temp sensor front left):  OK
temp:23.5 C (lcrit = 0 C, fan_max = 50 C, fan_min = 32 C, min = 5 C, max = 56 C, crit = 59 C)

Temp4(Temp Sensor Near CPU):  OK
temp:24.0 C (lcrit = 0 C, fan_max = 50 C, fan_min = 33 C, min = 5 C, max = 57 C, crit = 60 C)

Temp5(Intel CPU die sensor):  OK
temp:23.0 C (lcrit = 0 C, fan_max = 65 C, fan_min = 58 C, min = 5 C, max = 70 C, crit = 98 C)

Temp6(Intel CPU die sensor):  OK
temp:23.0 C (lcrit = 0 C, fan_max = 65 C, fan_min = 58 C, min = 5 C, max = 70 C, crit = 98 C)

Temp7(Intel CPU die sensor):  OK
temp:23.0 C (lcrit = 0 C, f


Temp9(Networking ASIC Die Temp Sensor):  OK
temp:36.0 C (lcrit = 0 C, fan_max = 82 C, fan_min = 67 C, min = 5 C, max = 87 C, crit = 95 C)

Temp10(Networking ASIC Die Temp Sensor):  OK
temp:36.9 C (lcrit = 0 C, fan_max = 82 C, fan_min = 67 C, min = 5 C, max = 87 C, crit = 95 C)

Temp11(Networking ASIC Die Temp Sensor):  OK
temp:37.4 C (lcrit = 0 C, fan_max = 82 C, fan_min = 67 C, min = 5 C, max = 87 C, crit = 95 C)

Temp12(Networking ASIC Die Temp Sensor):  OK
temp:35.5 C (lcrit = 0 C, fan_max = 82 C, fan_min = 67 C, min = 5 C, max = 87 C, crit = 95 C)

Temp13(Networking ASIC Die Temp Sensor):  OK
temp:35.5 C (lcrit = 0 C, fan_max = 82 C, fan_min = 67 C, min = 5 C, max = 87 C, crit = 95 C)

Temp14(Networking ASIC Die Temp Sensor):  OK
temp:37.9 C (lcrit = 0 C, fan_max = 82 C, fan_min = 67 C, min = 5 C, max = 87 C, crit = 95 C)

Temp15(Networking ASIC Die Temp Sensor):  OK
temp:37.9 C (lcrit = 0 C, fan_max = 82 C, fan_min = 67 C, min = 5 C, max = 87 C, crit = 95 C)

Temp16(Networking ASIC Die Temp Sensor):  OK
temp:36.0 C (lcrit = 0 C, fan_max = 82 C, fan_min = 67 C, min = 5 C, max = 87 C, crit = 95 C)

Messages:
PSU1:  status is installed, power_bad



Eric Pulvino, Official Rep

Rory,

Temperature readings from Temp sensors 7 and 8 are missing in the output above, but all the rest are quite low. Some manufacturers require that the minimum fan speed be set to 50% of the maximum speed at the lowest temperature values. I'm not sure if that's the case here, but looking at these values and taking Fan1 as an example, the maximum is 21300 RPM and the minimum is 12000 RPM, so even at minimum temperatures that fan is probably still not going to be quiet.
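
As a quick check of that arithmetic, the short Python sketch below works out how high the minimum speed is relative to the maximum, using the limits reported by `smonctl -v` in this thread. The `min_ratio` helper and the "Fan A"/"Fan B" labels are only illustrative names.

```python
def min_ratio(rpm_min, rpm_max):
    """Return the minimum speed as a percentage of the maximum speed."""
    return 100.0 * rpm_min / rpm_max


# (min, max) RPM limits as shown for the A and B fans in each tray.
limits = {"Fan A": (12000, 21300), "Fan B": (10100, 17700)}

for name, (rpm_min, rpm_max) in limits.items():
    print(f"{name}: minimum is {min_ratio(rpm_min, rpm_max):.1f}% of maximum")
# Fan A: minimum is 56.3% of maximum
# Fan B: minimum is 57.1% of maximum
```

Both fan types have a floor above half of their top speed, which is consistent with the point that the switch is unlikely to be quiet even when it is cool.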

I'll let Scott Emery comment further, as he is one of our platform engineers and is in a better position to provide guidance. In the meantime, could you post the temp values from sensors 7 and 8?

Thanks!

Rory Browne

Thanks Eric

Temp7(Intel CPU die sensor):  OK
temp:24.0 C (lcrit = 0 C, fan_max = 65 C, fan_min = 58 C, min = 5 C, max = 70 C, crit = 98 C)

Temp8(Intel CPU die sensor):  OK
temp:24.0 C (lcrit = 0 C, fan_max = 65 C, fan_min = 58 C, min = 5 C, max = 70 C, crit = 98 C)


Rory Browne

Eric, Scott, any ideas?

Eric Pulvino, Official Rep

Sending you an e-mail.