Linux Core Power Management User's Guide (v3.14)
Power Management Introduction
Power management is a wide-reaching topic, and reducing the power a system uses is handled by a number of drivers and techniques. Power management can broadly be classified into two categories: Dynamic/Active Power Management and Idle Power Management. This page covers power topics for the v3.14 kernel. The most recent version of this guide can be found at Linux Core Power Management User's Guide, and a full history of this guide can be found at Linux Core Power Management User's Guide History.
Dynamic Power Management Techniques
Dynamic (active) power management techniques reduce the power an SoC consumes while the system is active and performing tasks.
- DVFS
- CPUIdle
- SmartReflex
Dynamic Voltage and Frequency Scaling (MPU aka CPUFREQ)
Dynamic voltage and frequency scaling, or DVFS as it is commonly known, is the ability of a part to modify both the voltage and frequency it operates at based on need, user preference, or other factors. MPU DVFS is supported in the kernel by the cpufreq driver. All supported SoCs use the generic cpufreq-cpu0 driver.
Design: An OPP (operating performance point) is a voltage/frequency pair. When scaling from a higher OPP to a lower OPP, the frequency is reduced first and then the voltage. When scaling from a lower OPP to a higher OPP, the voltage is raised first and then the frequency.
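The ordering rule can be illustrated with a small shell sketch. The function name opp_transition_order is hypothetical and is not part of the kernel or of the cpufreq-cpu0 driver; it only prints which of the two steps comes first for a given pair of frequencies in kHz.

```shell
#!/bin/sh
# Hypothetical helper (not a kernel or driver API): print the order in which
# voltage and frequency are changed when moving between two OPPs.
# Arguments: current frequency and target frequency, both in kHz.
opp_transition_order() {
    cur_khz=$1
    target_khz=$2
    if [ "$target_khz" -gt "$cur_khz" ]; then
        # Scaling up: raise the voltage first, then the frequency.
        echo "voltage frequency"
    elif [ "$target_khz" -lt "$cur_khz" ]; then
        # Scaling down: reduce the frequency first, then the voltage.
        echo "frequency voltage"
    else
        # Same OPP: nothing to change.
        echo "none"
    fi
}
```

For example, opp_transition_order 275000 720000 reports the voltage step first, because the part is moving to a higher OPP.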
Release applicable
The latest release this documentation applies to is Kernel v3.14.
Supported Devices
- OMAP5
- DRA7xx
- AM437x
- AM335x
Driver Features
The frequency at which the MPU operates is selected by a driver called a governor. Each governor has a different strategy for selecting the most appropriate frequency. The following governors are available within the kernel:
- ondemand: This governor samples the CPU load and scales the frequency up aggressively in order to provide the required processing power.
- conservative: This governor is similar to ondemand but uses a less aggressive method of increasing the OPP of the MPU.
- performance: This governor statically sets the OPP of the MPU to the highest possible frequency.
- powersave: This governor statically sets the OPP of the MPU to the lowest possible frequency.
- userspace: This governor allows the user to set the desired OPP using any value found within scaling_available_frequencies by echoing it into scaling_setspeed.
More in-depth documentation about each governor can be found in the Linux kernel documentation here: https://www.kernel.org/doc/Documentation/cpu-freq/governors.txt
By default, cpufreq, the cpufreq-cpu0 driver, and all of the standard governors are enabled with the ondemand governor selected as the default governor. To make changes, follow the instructions below.
Source Location
drivers/cpufreq/cpufreq-cpu0.c
Kernel Configuration Options
The driver can be built into the kernel as a static module, dynamic module, or both.
$ make menuconfig
Select CPU Power Management from the main menu.
    ...
    ...
    Boot options --->
    CPU Power Management --->
    Floating point emulation --->
    ...
Select CPU Frequency Scaling as shown here:
    ...
    ...
    CPU Frequency Scaling --->
    [*] CPU idle PM support
    ...
All relevant options are listed below:
    [*] CPU Frequency scaling
    <*> CPU frequency translation statistics
    [*] CPU frequency translation statistics details
    Default CPUFreq governor (ondemand) --->
    -*- 'performance' governor
    <*> 'powersave' governor
    <*> 'userspace' governor for userspace frequency scaling
    -*- 'ondemand' cpufreq policy governor
    <*> 'conservative' cpufreq governor
    <*> Generic CPU0 cpufreq driver
    ...
DT Configuration
The clock information and the operating-points table need to be added as given in the example below. The voltage source needs to be hooked to the cpu0 node: as shown below, cpu0-supply must be mapped to the correct regulator node, which can be identified from the board schematics.
    cpus {
        #address-cells = <1>;
        #size-cells = <0>;
        cpu@0 {
            compatible = "arm,cortex-a8";
            device_type = "cpu";
            reg = <0>;
            voltage-tolerance = <2>; /* 2 percentage */
            clocks = <&dpll_mpu_ck>;
            clock-names = "cpu";
            clock-latency = <300000>; /* From omap-cpufreq driver */
        };
    };

    cpus {
        cpu@0 {
            cpu0-supply = <&dcdc2>;
        };
    };
Only the frequency entries must be the same. To implement Dynamic Frequency Scaling (DFS) on a system that has been designed to use a single voltage, set every voltage in the table to the same fixed value so that no voltage scaling takes place.
For AM335x and AM437x it is also important to make sure that every frequency has a corresponding entry in the opp-modifier table found in the same device tree file. More information can be found in the OPP Modifier section of this page.
Driver Usage
All of the standard governors are built into the kernel, and by default the ondemand governor is selected.
To view available governors,
$ cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_available_governors
conservative userspace powersave ondemand performance
To view current governor,
$ cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
ondemand
To set a governor,
$ echo userspace > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
To view current OPP (frequency in kHz),
$ cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_cur_freq
720000
To view supported OPPs (frequency in kHz),
$ cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_available_frequencies
275000 500000 600000 720000
To change OPP (this can be done only with the userspace governor; if a governor such as ondemand is used, the OPP changes automatically based on the system load),
$ echo 275000 > /sys/devices/system/cpu/cpu0/cpufreq/scaling_setspeed
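The individual commands above can be combined into a small helper. The following is a minimal sketch, assuming the sysfs layout shown in this section; the function name set_lowest_opp and its optional directory argument are made up for illustration and are not part of any TI tool.

```shell
#!/bin/sh
# Sketch: pin the MPU to its lowest OPP through the cpufreq sysfs files
# described above. The function name and the optional directory argument
# are assumptions made for illustration.
set_lowest_opp() {
    dir=${1:-/sys/devices/system/cpu/cpu0/cpufreq}

    # scaling_setspeed can only be used with the userspace governor.
    echo userspace > "$dir/scaling_governor" || return 1

    # Pick the numerically smallest entry from the space-separated list.
    lowest=$(tr ' ' '\n' < "$dir/scaling_available_frequencies" \
             | grep -E '^[0-9]+$' | sort -n | head -n 1)
    [ -n "$lowest" ] || return 1

    echo "$lowest" > "$dir/scaling_setspeed" || return 1

    # Report the frequency (kHz) that was requested.
    echo "$lowest"
}
```

Calling set_lowest_opp with no argument as root on the target operates on cpu0. Passing a directory makes it possible to try the logic against a copy of the sysfs files.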
CPUIdle
The cpuidle framework consists of two key components:
- A governor that decides the target C-state of the system.
- A driver that implements the functions to transition to the target C-state.
The idle loop is executed when the Linux scheduler has no thread to run. When the idle loop is executed, the current governor is called to decide the target C-state, that is, whether to remain in the current state or transition to a different one. The current driver is then called to transition to the selected state.
Release applicable
The latest release this documentation applies to is Kernel v3.12.
Supported Devices
- AM335x
- AM437x
Driver Features
AM335x supports two different C-states:
- MPU WFI
- MPU WFI + Clockdomain gating
AM437x supports two different C-states:
- MPU WFI
- MPU WFI + Clockdomain gating
Source Location
arch/arm/mach-omap2/cpuidle33xx.c
arch/arm/mach-omap2/cpuidle43xx.c
Kernel Configuration Options
The driver can be built into the kernel as a static module.
$ make menuconfig
Select CPU Power Management from the main menu.
    ...
    ...
    Boot options --->
    CPU Power Management --->
    Floating point emulation --->
    ...
Select CPU Idle as shown here:
    ...
    ...
    CPU Frequency Scaling --->
    CPU Idle --->
    ...
All relevant options are listed below:
    [*] CPU idle PM support
    [ ] Support multiple cpuidle drivers
    [*] Ladder governor (for periodic timer tick)
    -*- Menu governor (for tickless system)
    ARM CPU Idle Drivers ----
DT Configuration
There is no configuration required within the device tree for CPUIdle to function properly.
Driver Usage
CPUIdle requires no intervention by the user; it works transparently in the background. By default the ladder governor is selected.
It is possible to get statistics about the different C-states during runtime, such as how long each state is occupied.
# ls -l /sys/devices/system/cpu/cpu0/cpuidle/state0/
-r--r--r-- 1 root root 4096 Jan 1 00:02 desc
-r--r--r-- 1 root root 4096 Jan 1 00:02 latency
-r--r--r-- 1 root root 4096 Jan 1 00:02 name
-r--r--r-- 1 root root 4096 Jan 1 00:02 power
-r--r--r-- 1 root root 4096 Jan 1 00:02 time
-r--r--r-- 1 root root 4096 Jan 1 00:02 usage
# ls -l /sys/devices/system/cpu/cpu0/cpuidle/state1/
-r--r--r-- 1 root root 4096 Jan 1 00:05 desc
-r--r--r-- 1 root root 4096 Jan 1 00:05 latency
-r--r--r-- 1 root root 4096 Jan 1 00:03 name
-r--r--r-- 1 root root 4096 Jan 1 00:05 power
-r--r--r-- 1 root root 4096 Jan 1 00:05 time
-r--r--r-- 1 root root 4096 Jan 1 00:02 usage
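To read these statistics rather than just list the files, a loop over the state directories is enough. The sketch below assumes the standard cpuidle sysfs files shown in the listing (name, usage, time); the function name cpuidle_summary and its optional directory argument are hypothetical.

```shell
#!/bin/sh
# Sketch: print one line per C-state with its name, entry count and
# residency, read from the cpuidle sysfs files listed above.
# The function name and the optional directory argument are assumptions.
cpuidle_summary() {
    base=${1:-/sys/devices/system/cpu/cpu0/cpuidle}
    for state in "$base"/state*; do
        [ -d "$state" ] || continue
        # usage = number of times the state was entered,
        # time  = total time spent in the state, in microseconds.
        printf '%s name=%s usage=%s time_us=%s\n' \
            "$(basename "$state")" \
            "$(cat "$state/name")" \
            "$(cat "$state/usage")" \
            "$(cat "$state/time")"
    done
}
```

Running cpuidle_summary twice a few seconds apart and comparing the usage and time_us values gives a rough picture of which C-state the MPU spends its idle time in.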
SmartReflex
SmartReflex, also known as Adaptive Voltage Scaling (AVS), is an active PM technique that is based on the silicon type: a device is either strong or weak. Strong (hot) devices can run at a particular frequency at a voltage lower than the nominal voltage, while weak devices require a higher voltage. There are various classes of AVS.
Idle Power Management Techniques
These techniques ensure the system draws minimum power when it is idle, i.e. when no use case is running. This is accomplished by turning off as many unused peripherals as possible.
Suspend/Resume Support
The user can deliberately force the system into a low power state. There are various levels: suspend to memory (RAM), suspend to disk, etc. Certain parts support different levels of idle, such as DeepSleep0 or standby, which allow additional wake-up sources to be used with less wake latency at the expense of smaller power savings.
Release applicable
Latest release this documentation applies to is Kernel v3.14.
Supported Devices
- OMAP5
- DRA7xx
- AM437x
- AM335x
Driver Features
This is dependent on which device is in use. More information can be found in the device specific usage sections below.
Source Location
The files that provide suspend/resume differ from part to part; however, they generally reside in arch/arm/mach-omap2/pm****.c for the higher-level code and arch/arm/mach-omap2/sleep****.S for the lower-level code.
Kernel Configuration Options
Suspend/resume can be enabled or disabled within the kernel using the same method for all parts. To configure suspend/resume, enter the kernel configuration tool using:
$ make menuconfig
Select Power management options from the main menu.
    ...
    ...
    Kernel Features --->
    Boot options --->
    CPU Power Management --->
    Floating point emulation --->
    Userspace binary formats --->
    Power management options --->
    [*] Networking support --->
    Device Drivers --->
    ...
    ...
Select Suspend to RAM and standby to toggle the power management support.
    [*] Suspend to RAM and standby
    -*- Run-time PM core functionality
    ...
    < > Advanced Power Management Emulation
And then build the kernel as usual.
Power Management Usage
Although the techniques and concepts involved with power management are common across many platforms, the actual implementation and usage of each differs from part to part. The following sections cover the specifics of using the aforementioned power management techniques for each part that is supported by this release.
Common Power Management
IO Pad Configuration
In order to optimize power on the I/O supply rails, each pin can be given a "sleep" configuration in addition to its run-time configuration. This can be handled with the pinctrl states defined in the board device tree for each peripheral. These values are used to configure the PAD_CONF registers found in the control module of the device, which allow for selection of the MUXMODE of the pin and the operation of the internal pull resistor. Typically a device defines its pinctrl state for normal operation:
    davinci_mdio_default: davinci_mdio_default {
        pinctrl-single,pins = <
            /* MDIO */
            0x148 (PIN_INPUT_PULLUP | SLEWCTRL_FAST | MUX_MODE0) /* mdio_data.mdio_data */
            0x14c (PIN_OUTPUT_PULLUP | MUX_MODE0) /* mdio_clk.mdio_clk */
        >;
    };
In order to define a sleep state for the same device, another pinctrl state can be defined:
    davinci_mdio_sleep: davinci_mdio_sleep {
        pinctrl-single,pins = <
            /* MDIO reset value */
            0x148 (PIN_INPUT_PULLDOWN | MUX_MODE7)
            0x14c (PIN_INPUT_PULLDOWN | MUX_MODE7)
        >;
    };
The driver then defines the sleep state in addition to the default state:
    &davinci_mdio {
        pinctrl-names = "default", "sleep";
        pinctrl-0 = <&davinci_mdio_default>;
        pinctrl-1 = <&davinci_mdio_sleep>;
        ...
Although the driver core handles selection of the default state during the initial probe of the driver, some extra work may be needed within the driver to make sure the sleep state is selected during suspend and the default state is re-selected at resume time. This is accomplished by placing a call to pinctrl_pm_select_sleep_state at the end of the suspend handler of the driver and a call to pinctrl_pm_select_default_state at the start of the resume handler. These functions do not fail if the driver cannot find a sleep state, so even with the calls added the pins simply stay in their default state when no sleep state is defined. Some drivers rely on the default configuration of the pins without any default pinctrl entry being set, but if a sleep state is added, a default state must be added as well so that the resume path can properly reconfigure the pins. Most TI drivers included with the 3.12 release already have this done.
The required pinctrl states will differ from board to board; the configuration of each pin depends on the specific use of the pin and what it is connected to. Generally the most desirable configuration is an internal pull-down with GPIO mode set, which gives minimal leakage. However, where external pull-ups are connected to the line (as for I2C lines), it makes more sense to disable the pull on the pin. The pins are supplied by several different rails, which are described in the data manual for the part in use. By measuring current draw on each of these rails during suspend, it may be possible to fine-tune the pin configuration for maximum power savings. The AM335x EVM has pinctrl sleep states defined for its peripherals and serves as a good example.
Even pins that are not in use and not connected to anything can still leak some power, so it is important to consider these pins as well when implementing the pad configuration. This can be accomplished by defining a pinctrl state for unused pins and then assigning it directly to the pinctrl node itself in the board device tree, so the state is configured during boot even though there is no specific driver for these pins:
    &am43xx_pinmux {
        pinctrl-names = "default";
        pinctrl-0 = <&unused_pins>;
        ...
        unused_pins: unused_pins {
            pinctrl-single,pins = <
                0x80 (PIN_INPUT_PULLDOWN | MUX_MODE7) /* gpmc_csn1.mmc1_clk */
                ...
Power Management on AM335 and AM437
Because of the high level of overlap of power management techniques between the two parts, AM335 and AM437 are covered in the same section. The power management features enabled on AM335x are as follows:
- Suspend/Resume
  - DeepSleep0 is supported with mem power state
  - Standby is supported with standby power state
- MPU DVFS
- CPU-Idle
CM3 Firmware
A small ARM Cortex-M3 co-processor is present on these parts that helps the SoC get to the lowest power mode. This processor requires firmware to be loaded from the kernel at run-time for all low-power features of the SoC to be enabled. The name of the binary file containing this firmware is am335x-pm-firmware.elf for both SoCs. The git repository containing the source and precompiled binaries of this file can be found here: https://git.ti.com/ti-cm3-pm-firmware/amx3-cm3/commits/ti-v3.14.y
There are two options for loading the CM3 firmware. If using the CoreSDK, the firmware will be included in /lib/firmware and the root filesystem should handle loading it automatically. Placing any version of am335x-pm-firmware.elf at this location will cause it to load automatically during boot. It is also possible to manually load the firmware by following the instructions below:
The final option is to build the binary directly into the kernel. Note that if the firmware binary is built into the kernel, it cannot be loaded using the methods above and will be loaded automatically during boot. To accomplish this, first make sure you have placed am335x-pm-firmware.elf under <KERNEL SOURCE>/firmware. Then enter the kernel configuration by typing:
$ make menuconfig
Select Device Drivers from the main menu.
    ...
    ...
    Kernel Features --->
    Boot options --->
    CPU Power Management --->
    Floating point emulation --->
    Userspace binary formats --->
    Power management options --->
    [*] Networking support --->
    Device Drivers --->
    ...
    ...
Select Generic Driver Options
    Generic Driver Options
    CBUS support
    ...
    ...
Configure the name of the PM firmware and the location as shown below
    ...
    -*- Userspace firmware loading support
    [*] Include in-kernel firmware blobs in the kernel binary
    (am335x-pm-firmware.elf) External firmware blobs to build into the kernel binary
    (firmware) Firmware blobs root directory
The CM3 firmware is needed for all idle low power modes on AM335x and AM437x and for cpuidle on AM335x. During boot, if the CM3 firmware has been properly loaded, the following message will be displayed:
PM: CM3 Firmware Version = 0x190
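A quick way to confirm that the firmware was loaded is to search the kernel log for this message. The sketch below assumes the message format shown above; the function name cm3_fw_version is hypothetical, and it reads log text from standard input so it can be fed from dmesg.

```shell
#!/bin/sh
# Sketch: extract the CM3 firmware version from kernel log text on stdin.
# Relies only on the "PM: CM3 Firmware Version = 0x..." message quoted
# above; the function name is an assumption made for illustration.
cm3_fw_version() {
    # Print the hex value from the last matching line, if any.
    sed -n 's/.*PM: CM3 Firmware Version = \(0x[0-9A-Fa-f]*\).*/\1/p' \
        | tail -n 1
}
```

On the target this would be used as dmesg | cm3_fw_version. Empty output suggests the firmware was not loaded or that the message has already left the log buffer.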
Suspend/Resume
The LCPD release supports mem sleep and standby sleep. On both AM335 and AM437, mem sleep corresponds to DeepSleep0. The following wake sources are supported from DeepSleep0:
- UART
- GPIO0
- Touchscreen (AM335x only)
To enter DeepSleep0 enter the following at the command line:
$ echo mem > /sys/power/state
From here, the system will enter DeepSleep0. At any point, triggering one of the aforementioned wake-up sources will cause the kernel to resume and the board to exit DeepSleep0. A successful suspend/resume cycle should look like this:
$ echo mem > /sys/power/state
$ PM: Syncing filesystems ... done.
$ Freezing user space processes ... (elapsed 0.007 seconds) done.
$ Freezing remaining freezable tasks ... (elapsed 0.006 seconds) done.
$ Suspending console(s) (use no_console_suspend to debug)
$ PM: suspend of devices complete after 194.787 msecs
$ PM: late suspend of devices complete after 14.477 msecs
$ PM: noirq suspend of devices complete after 17.849 msecs
$ Disabling non-boot CPUs ...
$ PM: Successfully put all powerdomains to target state
$ PM: Wakeup source UART
$ PM: noirq resume of devices complete after 39.113 msecs
$ PM: early resume of devices complete after 10.180 msecs
$ net eth0: initializing cpsw version 1.12 (0)
$ net eth0: phy found : id is : 0x4dd074
$ PM: resume of devices complete after 368.844 msecs
$ Restarting tasks ... done
$
It is also possible to enter standby sleep with the possibility to use additional wake sources and have a faster resume time while using slightly more power. To enter standby sleep, enter the following at the command line:
$ echo standby > /sys/power/state
A successful cycle through standby sleep should look the same as DeepSleep0.
In the event that a cycle fails, the following message will be present in the log:
$ PM: Could not transition all powerdomains to target state
This is usually due to clocks that have not properly been shut off within the PER powerdomain. Make sure that all clocks within CM_PER are properly shut off and try again.
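The success and failure messages quoted in this section can be checked automatically after a suspend cycle. The following is a minimal sketch; the function name suspend_result and its output strings are assumptions, and it reads kernel log text from standard input.

```shell
#!/bin/sh
# Sketch: classify a suspend/resume cycle from kernel log text on stdin,
# using the two messages quoted in this section.
# The function name and its output strings are assumptions.
suspend_result() {
    log=$(cat)
    case $log in
        # A failure anywhere in the supplied log takes priority.
        *"PM: Could not transition all powerdomains to target state"*)
            echo "failed" ;;
        *"PM: Successfully put all powerdomains to target state"*)
            echo "ok" ;;
        *)
            echo "unknown" ;;
    esac
}
```

On the target this would be used as dmesg | suspend_result. Because a failure anywhere in the input wins, it is best to pass only the log lines from the most recent cycle.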
RTC-Only and RTC+DDR Mode
The LCPD release also supports two RTC modes depending on what the specific hardware in use supports. RTC+DDR Mode is similar to the Suspend/Resume above but only supports wake by the Power Button present on the board or from an RTC ALARM2 Event. RTC-Only mode supports the same wake sources, however DDR context is not maintained so a wake event causes a cold boot.
RTC-Only mode is supported on:
- AM437x GP EVM
- AM437x SK EVM
RTC+DDR mode is supported on:
- AM437x GP EVM
- AM437x SK EVM
The first step in using either of the RTC modes is to enable off mode by typing the following at the command line:
$ echo 1 > /sys/kernel/debug/pm_debug/enable_off_mode
With off-mode enabled, a command to enter DeepSleep0 will now enter RTC-Only mode:
$ echo mem > /sys/power/state
This method of entry only supports the Power button as the wake source.
To use the RTC as a wake source, after enabling off mode use the following command:
$ rtcwake -s <NUMBER OF SECONDS TO SLEEP> -d /dev/rtc0 -m mem
Whether your board enters RTC-Only mode or RTC+DDR mode depends on the regulator configuration, specifically on whether the regulator that supplies the DDR is configured to remain on during suspend. This is supported by the TPS65218 in use on the AM437x boards but not by the TPS65217 or TPS65910 present on AM335x boards. By default, the AM437x boards are configured for RTC+DDR mode. This is enabled by the regulator-suspend-enable flag present in arch/arm/boot/dts/am437x-gp-evm.dts inside the dcdc3 sub-node of the tps65218 pmic node, which is the dcdc regulator that supplies the DDR:
    tps65218: tps65218@24 {
        reg = <0x24>;
        compatible = "ti,tps65218";
        interrupts = <GIC_SPI 7 IRQ_TYPE_NONE>; /* NMIn */
        interrupt-parent = <&gic>;
        interrupt-controller;
        #interrupt-cells = <2>;
        ...
        dcdc3: regulator-dcdc3 {
            compatible = "ti,tps65218-dcdc3";
            regulator-name = "vdcdc3";
            regulator-suspend-enable;
            regulator-min-microvolt = <1500000>;
            regulator-max-microvolt = <1500000>;
            regulator-boot-on;
            regulator-always-on;
        };
        ...
    };
Without this flag the board will enter RTC-Only mode with off-mode enabled, leading the RTC or push-button wake source to act like a cold boot.
It is also important to make sure that you are using the proper U-Boot. A specific U-Boot build is required in order to support RTC+DDR mode; otherwise the following message appears during boot of the kernel:
PM: bootloader does not support rtc-only!
When building U-Boot, rather than using am43xx_evm_config you must use am43xx_evm_rtconly_config to support either RTC mode.
DDR3 VTT Regulator Toggling
Some boards using DDR3 have a VTT Regulator that must be shut off during suspend to further conserve power. An example of a board with this regulator is the AM335X EVM SK. On AM335x, GPIO0 remains powered during DS0 so it is possible to use this to toggle a pin to control the VTT regulator. This is handled by the wakeup M3 processor and gets defined inside the device node within the board device tree file.
    &wkup_m3 {
        ti,needs-vtt-toggle;
        ti,vtt-gpio-pin = <7>;
    };
ti,needs-vtt-toggle is used to indicate that the VTT regulator must be toggled, and ti,vtt-gpio-pin indicates which pin within GPIO0 is connected to the VTT regulator to control it.
Power Management on DRA7
The power management features enabled on DRA7 are as follows:
- Suspend/Resume
- MPU DVFS
- SmartReflex
DVFS
On J6 we use the ondemand governor, which implements load-based DVFS. When the load is high, a particular VDD is operated at the highest OPP; as the load decreases, the OPP is lowered.
J6 VDD_MPU supports only two OPPs for now. OPP_TURBO is not yet enabled. J6 VDD_CORE has only one OPP, which removes the possibility of DVFS on VDD_CORE. J6 GPU DVFS is TBD.
Supported OPPs:
    /* kHz     uV */
    1000000    1090000
    1176000    1210000
SmartReflex
J6 uses Class 0, a very simple class of AVS. The SR-compensated voltages for the different OPPs of the various voltage domains are burnt into the EFUSE registers. Whenever a new OPP is set, the SR-compensated voltage value for that OPP is read from the EFUSE registers and applied.
Software Support: AVS Class 0 is supported on the K3.12 Integration tree kernel. A regulator chaining approach is used: a dummy SR-Class0 regulator is created per voltage domain along with the real regulator.
On entering an OPP, the voltage to be selected is no longer the traditional nominal voltage but the voltage read from the efuse offset, encoded in millivolts. Each device has its own unique voltage for a given OPP. Therefore, it is not possible to encode a range of voltages representing an OPP voltage.
DRA processors may be powered using various PMICs - I2C based ones such as TPS659039 or SPI / GPIO controlled ones as well.
The cpufreq/devfreq driver, which controls voltage and frequency pairs, is traditionally used as follows:
    cpufreq/devfreq --> PMIC regulator
                   \-> clock framework
This opens up a few issues:
a) The PMIC regulator is designed for platforms that may not use SmartReflex-based SoCs, so encoding the efuse offsets into every possible PMIC regulator driver is impractical.
b) Voltage values are device specific and are therefore not known a priori, so they cannot be encoded into the DTB.
To simplify this, we introduce:
    cpufreq/devfreq --> SmartReflex Class 0 regulator --> PMIC regulator
                   \-> clock framework
The Class 0 regulator has the information needed to translate the "nominal voltage" into the voltage value stored at the efuse offset. Example encoding:
    uVolts     mVolt --> stored as 16 bit hex value of mV
    975000     975   --> 0x03CF
    1075000    1075  --> 0x0433
    1200000    1200  --> 0x04B0
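The example encoding is a unit conversion followed by hex formatting, which can be reproduced with shell arithmetic. This sketch illustrates only the number format shown in the example. It does not read any efuse register, and the function name uv_to_efuse_hex is hypothetical.

```shell
#!/bin/sh
# Sketch of the example encoding: a voltage in microvolts is expressed in
# millivolts and printed as a 16-bit hex value.
# The function name is an assumption made for illustration.
uv_to_efuse_hex() {
    uv=$1
    mv=$((uv / 1000))        # microvolts -> millivolts
    printf '0x%04X\n' "$mv"  # four hex digits, zero padded
}
```

Calling uv_to_efuse_hex 975000 prints 0x03CF, matching the first row of the example.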
Idle Power Management
On Vayu, only Suspend to RAM is currently supported. USB has issues waking up once it is suspended, so the suspend/resume feature suspends only the MPU subsystem and does not transition the Core domain. The Core domain idles only when USB idles, which would leave USB unable to wake up. Hence only the MPU is suspended and resumed currently.
Steps to Suspend:
To use the UART as a wake-up source from suspend, please make sure that no_console_suspend is given in bootargs. This is because UART module wake-up is broken and IO-Daisy wake-up is not yet supported.
UART resume needs multiple things:
a) no_console_suspend in bootargs
b) Enable UART wakeup capability:
   echo enabled > /sys/devices/ocp.3/4806a000.serial/tty/ttyO0/power/wakeup
c) echo mem > /sys/power/state
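The three steps can be wrapped in a script that refuses to suspend when the bootargs requirement is not met. This is a minimal sketch that uses the paths given above as defaults; the function names and the optional file arguments are assumptions made for illustration.

```shell
#!/bin/sh
# Sketch: check the UART wake-up prerequisites listed above before
# suspending. Function names and the file arguments are assumptions.

# Succeeds when no_console_suspend appears in the given kernel command
# line file (defaults to /proc/cmdline).
has_no_console_suspend() {
    cmdline_file=${1:-/proc/cmdline}
    tr ' ' '\n' < "$cmdline_file" | grep -qx 'no_console_suspend'
}

# Enables UART wake-up and suspends to RAM; refuses if bootargs are wrong.
suspend_with_uart_wake() {
    wakeup=${1:-/sys/devices/ocp.3/4806a000.serial/tty/ttyO0/power/wakeup}
    state=${2:-/sys/power/state}
    cmdline_file=${3:-/proc/cmdline}

    if ! has_no_console_suspend "$cmdline_file"; then
        echo "no_console_suspend missing from bootargs" >&2
        return 1
    fi
    echo enabled > "$wakeup" || return 1   # step b
    echo mem > "$state"                    # step c
}
```

Running suspend_with_uart_wake with no arguments as root on the target performs steps b and c only when step a is satisfied.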
Cpuidle, RTC only mode and Suspend to disk are TBD.