nvidia-smi(1)                              NVIDIA                              nvidia-smi(1)
NAME
nvidia-smi - NVIDIA System Management Interface program
SYNOPSIS
nvidia-smi [OPTION1 [ARG1]] [OPTION2 [ARG2]] ...
DESCRIPTION
nvidia-smi (also NVSMI) provides monitoring and management capabilities for each of NVIDIA's Tesla,
Quadro, GRID and GeForce devices from Fermi and higher architecture families. GeForce Titan series
devices are supported for most functions, with very limited information provided for the remainder of the
GeForce brand. NVSMI is a cross-platform tool that supports all standard NVIDIA driver-supported Linux
distros, as well as 64-bit versions of Windows starting with Windows Server 2008 R2. Metrics can be
consumed directly by users via stdout, or provided by file via CSV and XML formats for scripting purposes.
Note that much of the functionality of NVSMI is provided by the underlying NVML C-based library. See
the NVIDIA developer website link below for more information about NVML. NVML-based Python
bindings are also available.
The output of NVSMI is not guaranteed to be backwards compatible. However, both NVML and the
Python bindings are backwards compatible, and should be the first choice when writing any tools that must
be maintained across NVIDIA driver releases.
NVML SDK:
http://developer.nvidia.com/nvidia-management-library-nvml/
Python bindings:
http://pypi.python.org/pypi/nvidia-ml-py/
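Since NVSMI output may change between releases, a tool that has to survive driver upgrades would usually call NVML through the Python bindings rather than parse text. The following is a minimal sketch, assuming the nvidia-ml-py package (imported as pynvml) is installed. The helper name list_gpus is hypothetical, and it returns an empty list when the bindings or the driver are not available.

```python
def list_gpus():
    """Return [(index, name, uuid), ...] via NVML, or [] if NVML is unavailable."""
    try:
        import pynvml  # provided by the nvidia-ml-py package
    except ImportError:
        return []
    try:
        pynvml.nvmlInit()
    except pynvml.NVMLError:
        return []  # no NVIDIA driver or NVML library on this machine
    try:
        gpus = []
        for index in range(pynvml.nvmlDeviceGetCount()):
            handle = pynvml.nvmlDeviceGetHandleByIndex(index)
            name = pynvml.nvmlDeviceGetName(handle)
            uuid = pynvml.nvmlDeviceGetUUID(handle)
            # Older releases of the bindings return bytes, newer ones return str.
            if isinstance(name, bytes):
                name = name.decode()
            if isinstance(uuid, bytes):
                uuid = uuid.decode()
            gpus.append((index, name, uuid))
        return gpus
    finally:
        pynvml.nvmlShutdown()


if __name__ == "__main__":
    for index, name, uuid in list_gpus():
        print(index, name, uuid)
```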
OPTIONS
GENERAL OPTIONS
-h, --help
Print usage information and exit.
SUMMARY OPTIONS
-L, --list-gpus
List each of the NVIDIA GPUs in the system, along with their UUIDs.
QUERY OPTIONS
-q, --query
Display GPU or Unit info. Displayed info includes all data listed in the (GPU ATTRIBUTES) or (UNIT
ATTRIBUTES) sections of this document. Some devices and/or environments don't support all possible
information. Any unsupported data is indicated by an "N/A" in the output. By default, information for all
available GPUs or Units is displayed. Use the -i option to restrict the output to a single GPU or Unit.
[plus optional]
-u, --unit
Display Unit data instead of GPU data. Unit data is only available for NVIDIA S-class Tesla enclosures.
-i, --id=ID
Display data for a single specified GPU or Unit. The specified id may be the GPU/Unit's 0-based index in
the natural enumeration returned by the driver, the GPU's board serial number, the GPU's UUID, or the
GPU's PCI bus ID (as domain:bus:device.function in hex). It is recommended that users desiring
consistency use either UUID or PCI bus ID, since device enumeration ordering is not guaranteed to be consistent
between reboots and board serial number might be shared between multiple GPUs on the same board.
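A script that stores GPU identifiers may want to check that it was given one of the recommended stable forms rather than an enumeration index. The sketch below is illustrative only. The helper names are hypothetical, and it assumes that the PCI domain is written with 4 or 8 hex digits and that UUID strings start with "GPU-"; neither detail is stated in this document.

```python
import re

# domain:bus:device.function, every field in hexadecimal.
# Assumption: the domain is written with either 4 or 8 hex digits.
_PCI_BUS_ID = re.compile(
    r"[0-9a-fA-F]{4}(?:[0-9a-fA-F]{4})?:[0-9a-fA-F]{2}:[0-9a-fA-F]{2}\.[0-9a-fA-F]"
)


def is_pci_bus_id(value):
    """True if value looks like domain:bus:device.function in hex."""
    return _PCI_BUS_ID.fullmatch(value) is not None


def is_stable_gpu_id(value):
    """True for the two forms recommended for consistency across reboots."""
    # Assumption: UUID strings are printed with a "GPU-" prefix.
    return is_pci_bus_id(value) or value.startswith("GPU-")
```

An index such as "0" is rejected by is_stable_gpu_id, which matches the advice that enumeration order may change between reboots.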
nvidia−smi 358.02
2015/9/2
-f FILE, --filename=FILE
Redirect query output to the specified file in place of the default stdout. The specified file will be
overwritten.
-x, --xml-format
Produce XML output in place of the default human-readable format. Both GPU and Unit query outputs
conform to corresponding DTDs. These are available via the --dtd flag.
--dtd
Use with -x. Embed the DTD in the XML output.
--debug=FILE
Produces an encrypted debug log for use in submission of bugs back to NVIDIA.
-d TYPE, --display=TYPE
Display only selected information: MEMORY, UTILIZATION, ECC, TEMPERATURE, POWER, CLOCK,
COMPUTE, PIDS, PERFORMANCE, SUPPORTED_CLOCKS, PAGE_RETIREMENT, ACCOUNTING. Flags can be
combined with commas, e.g. "MEMORY,ECC". Sampling data with max, min and avg is also returned for
the POWER, UTILIZATION and CLOCK display types. Does not work with the -u/--unit or
-x/--xml-format flags.
-l SEC, --loop=SEC
Continuously report query data at the specified interval, rather than the default of just once. The
application will sleep in between queries. Note that on Linux, ECC error or XID error events will print
out during the sleep period if the -x flag was not specified. Pressing Ctrl+C at any time will abort the
loop, which will otherwise run indefinitely. If no argument is specified for the -l form, a default
interval of 5 seconds is used.
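The query flags above combine in a few constrained ways: -d accepts only the listed display types and cannot be used with -u or -x, and --dtd is used together with -x. A small, hypothetical helper (build_query_command is not part of nvidia-smi) can encode those rules when assembling a command line. This is a sketch under those stated rules only, not a complete model of the tool.

```python
# Display types copied from the -d/--display description.
DISPLAY_TYPES = {
    "MEMORY", "UTILIZATION", "ECC", "TEMPERATURE", "POWER", "CLOCK",
    "COMPUTE", "PIDS", "PERFORMANCE", "SUPPORTED_CLOCKS",
    "PAGE_RETIREMENT", "ACCOUNTING",
}


def build_query_command(gpu_id=None, unit=False, xml=False, dtd=False,
                        display=(), filename=None, loop_seconds=None):
    """Return an argv list for `nvidia-smi -q` plus the optional flags."""
    unknown = [d for d in display if d not in DISPLAY_TYPES]
    if unknown:
        raise ValueError(f"unknown display type(s): {unknown}")
    if display and (unit or xml):
        raise ValueError("-d does not work with -u/--unit or -x/--xml-format")
    if dtd and not xml:
        raise ValueError("--dtd is used together with -x")

    argv = ["nvidia-smi", "-q"]
    if unit:
        argv.append("-u")
    if gpu_id is not None:
        argv += ["-i", str(gpu_id)]
    if xml:
        argv.append("-x")
    if dtd:
        argv.append("--dtd")
    if display:
        argv += ["-d", ",".join(display)]
    if filename:
        argv += ["-f", filename]
    if loop_seconds is not None:
        argv += ["-l", str(loop_seconds)]
    return argv
```

On a machine with the driver installed, the returned list could be handed to subprocess.run(argv, check=True).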
SELECTIVE QUERY OPTIONS
Allows the caller to pass an explicit list of properties to query.
[one of]
--query-gpu=
Information about GPU. Pass a comma-separated list of the properties you want to query, e.g.
--query-gpu=pci.bus_id,persistence_mode. Call --help-query-gpu for more info.
--query-supported-clocks=
List of supported clocks. Call --help-query-supported-clocks for more info.
--query-compute-apps=
List of currently active compute processes. Call --help-query-compute-apps for more info.
--query-accounted-apps=
List of accounted compute processes. Call --help-query-accounted-apps for more info.
--query-retired-pages=
List of GPU device memory pages that have been retired. Call --help-query-retired-pages for more info.
[mandatory]
--format=
Comma separated list of format options:
csv - comma separated values (MANDATORY)
noheader - skip first line with column headers
nounits - don’t print units for numerical values
[plus any of]
-i, --id=ID
Display data for a single specified GPU. The specified id may be the GPU's 0-based index in the natural
enumeration returned by the driver, the GPU's board serial number, the GPU's UUID, or the GPU's PCI bus
ID (as domain:bus:device.function in hex). It is recommended that users desiring consistency use either
UUID or PCI bus ID, since device enumeration ordering is not guaranteed to be consistent between reboots
and board serial number might be shared between multiple GPUs on the same board.
-f FILE, --filename=FILE
Redirect query output to the specified file in place of the default stdout. The specified file will be
overwritten.
-l SEC, --loop=SEC
Continuously report query data at the specified interval, rather than the default of just once. The
application will sleep in between queries. Note that on Linux, ECC error or XID error events will print
out during the sleep period if the -x flag was not specified. Pressing Ctrl+C at any time will abort the
loop, which will otherwise run indefinitely. If no argument is specified for the -l form, a default
interval of 5 seconds is used.
-lms ms, --loop-ms=ms
Same as -l, --loop but in milliseconds.
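For scripting, the selective query options are typically paired with --format=csv and the result is read back as rows. The sketch below shows one way this might look. The helper names are hypothetical, the property names come from the --query-gpu example in this section, and the sample output text in the usage note is invented for illustration rather than captured from a real device.

```python
import csv
import io


def build_gpu_query(properties, noheader=False, nounits=False,
                    gpu_id=None, loop_ms=None):
    """Return an argv list for a --query-gpu call with CSV output."""
    if not properties:
        raise ValueError("at least one property is required")
    fmt = ["csv"]  # csv is the mandatory format option
    if noheader:
        fmt.append("noheader")
    if nounits:
        fmt.append("nounits")
    argv = [
        "nvidia-smi",
        "--query-gpu=" + ",".join(properties),
        "--format=" + ",".join(fmt),
    ]
    if gpu_id is not None:
        argv += ["-i", str(gpu_id)]
    if loop_ms is not None:
        argv += ["-lms", str(loop_ms)]
    return argv


def parse_csv_rows(text, properties):
    """Parse output produced with --format=csv,noheader into dicts."""
    reader = csv.reader(io.StringIO(text), skipinitialspace=True)
    return [dict(zip(properties, row)) for row in reader if row]
```

For example, parse_csv_rows("0000:01:00.0, Enabled\n", ["pci.bus_id", "persistence_mode"]) yields one dictionary per GPU, keyed by the queried property names.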
DEVICE MODIFICATION OPTIONS
[any one of]
-pm, --persistence-mode=MODE
Set the persistence mode for the target GPUs. See the (GPU ATTRIBUTES) section for a description of
persistence mode. Requires root. Will impact all GPUs unless a single GPU is specified using the -i
argument. The effect of this operation is immediate. However, it does not persist across reboots. After each
reboot persistence mode will default to "Disabled". Available on Linux only.
-e, --ecc-config=CONFIG
Set the ECC mode for the target GPUs. See the (GPU ATTRIBUTES) section for a description of ECC
mode. Requires root. Will impact all GPUs unless a single GPU is specified using the -i argument. This
setting takes effect after the next reboot and is persistent.
-p, --reset-ecc-errors=TYPE
Reset the ECC error counters for the target GPUs. See the (GPU ATTRIBUTES) section for a description
of ECC error counter types. Available arguments are 0|VOLATILE or 1|AGGREGATE. Requires root.
Will impact all GPUs unless a single GPU is specified using the -i argument. The effect of this operation
is immediate.
-c, --compute-mode=MODE
Set the compute mode for the target GPUs. See the (GPU ATTRIBUTES) section for a description of
compute mode. Requires root. Will impact all GPUs unless a single GPU is specified using the -i argument.
The effect of this operation is immediate. However, it does not persist across reboots. After each reboot
compute mode will reset to "DEFAULT".
-dm TYPE, --driver-model=TYPE
-fdm TYPE, --force-driver-model=TYPE
Enable or disable TCC driver model. For Windows only. Requires administrator privileges. -dm will fail
if a display is attached, but -fdm will force the driver model to change. Will impact all GPUs unless a
single GPU is specified using the -i argument. A reboot is required for the change to take place. See
Driver Model for more information on Windows driver models.
--gom=MODE
Set GPU Operation Mode: 0/ALL_ON, 1/COMPUTE, 2/LOW_DP. Supported on GK110 M-class and
X-class Tesla products from the Kepler family. Not supported on Quadro and Tesla C-class products.
LOW_DP and ALL_ON are the only modes supported on GeForce Titan devices. Requires administrator
privileges. See GPU Operation Mode for more information about GOM. GOM changes take effect after
reboot. The reboot requirement might be removed in the future. Compute-only GOMs don't support
WDDM (Windows Display Driver Model).
-r, --gpu-reset
Trigger a reset of the GPU. Can be used to clear GPU HW and SW state in situations that would otherwise
require a machine reboot. Typically useful if a double-bit ECC error has occurred. Requires the -i switch to
target a specific device. Requires root. There can't be any applications using this particular device (e.g.
a CUDA application, a graphics application like the X server, or a monitoring application like another
instance of nvidia-smi). There also can't be any compute applications running on any other GPU in the
system. Only on supported devices from the Fermi and Kepler families running on Linux.
GPU reset is not guaranteed to work in all cases. It is not recommended for production environments at this
time. In some situations there may be HW components on the board that fail to revert back to an initial
state following the reset request. This is more likely to be seen on Fermi-generation products vs. Kepler,
and more likely to be seen if the reset is being performed on a hung GPU.
Following a reset, it is recommended that the health of the GPU be verified before further use. The
nvidia-healthmon tool is a good choice for this test. If the GPU is not healthy, a complete reset should be
instigated by power cycling the node.
Visit http://developer.nvidia.com/gpu-deployment-kit to download the GDK and nvidia-healthmon.
-ac, --applications-clocks=MEM_CLOCK,GRAPHICS_CLOCK
Specifies maximum <memory,graphics> clocks as a pair (e.g. 2000,800) that defines the GPU's speed while
running applications on a GPU. For Tesla devices from the Kepler+ family and Maxwell-based GeForce
Titan. Requires root unless restrictions are relaxed with the -acp command.
-rac, --reset-applications-clocks
Resets the applications clocks to the default value. For Tesla devices from the Kepler+ family and
Maxwell-based GeForce Titan. Requires root unless restrictions are relaxed with the -acp command.
-acp, --applications-clocks-permission=MODE
Toggle whether applications clocks can be changed by all users or only by root. Available arguments are
0|UNRESTRICTED, 1|RESTRICTED. For Tesla devices from the Kepler+ family and Maxwell-based
GeForce Titan. Requires root.
-pl, --power-limit=POWER_LIMIT
Specifies maximum power limit in watts. Accepts integer and floating point numbers. Only on supported
devices from the Kepler family. Requires administrator privileges. The value needs to be between the Min
and Max Power Limit as reported by nvidia-smi.
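Because the requested limit has to fall between the reported Min and Max Power Limit, a wrapper script might check the range before invoking the tool. The following is a minimal sketch. The helper name is hypothetical, and the caller is assumed to have already read the minimum and maximum from nvidia-smi; the numbers in the usage note are made up.

```python
def build_power_limit_command(watts, min_limit, max_limit, gpu_id=None):
    """Return an argv list for -pl after checking the allowed range."""
    watts = float(watts)  # integer and floating point values are accepted
    if not (min_limit <= watts <= max_limit):
        raise ValueError(
            f"{watts:g} W is outside the allowed range "
            f"{min_limit:g}-{max_limit:g} W"
        )
    argv = ["nvidia-smi"]
    if gpu_id is not None:
        argv += ["-i", str(gpu_id)]
    argv += ["-pl", f"{watts:g}"]
    return argv
```

With hypothetical limits of 100 W and 235 W, build_power_limit_command(150, 100, 235, gpu_id=0) produces the arguments for `nvidia-smi -i 0 -pl 150`, while a request for 300 W raises ValueError before anything is executed.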
-am, --accounting-mode=MODE
Enables or disables GPU Accounting. With GPU Accounting one can keep track of the usage of resources
throughout the lifespan of a single process. Only on supported devices from the Kepler family. Requires
administrator privileges. Available arguments are 0|DISABLED or 1|ENABLED.
-caa, --clear-accounted-apps
Clears all processes accounted so far. Only on supported devices from the Kepler family. Requires
administrator privileges.
--auto-boost-default=MODE
Set the default auto boost policy to 0/DISABLED or 1/ENABLED, enforcing the change only after the last
boost client has exited. Only on certain Tesla devices from the Kepler+ family and Maxwell-based
GeForce devices. Requires root.
--auto-boost-default-force=MODE
Set the default auto boost policy to 0/DISABLED or 1/ENABLED, enforcing the change immediately.
Only on certain Tesla devices from the Kepler+ family and Maxwell-based GeForce devices. Requires
root.
--auto-boost-permission=MODE
Allow non-admin/root control over auto boost mode. Available arguments are 0|UNRESTRICTED,
1|RESTRICTED. Only on certain Tesla devices from the Kepler+ family and Maxwell-based GeForce
devices. Requires root.
[plus optional]
-i, --id=ID
Modify a single specified GPU. The specified id may be the GPU/Unit's 0-based index in the natural
enumeration returned by the driver, the GPU's board serial number, the GPU's UUID, or the GPU's PCI bus ID
(as domain:bus:device.function in hex). It is recommended that users desiring consistency use either UUID
or PCI bus ID, since device enumeration ordering is not guaranteed to be consistent between reboots and
board serial number might be shared between multiple GPUs on the same board.
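Several of the device modification options take a small, fixed set of arguments, and all of them affect every GPU unless -i is given. The sketch below validates an argument against the values listed in this section and targets a single GPU when an id is supplied. The helper names and the table are hypothetical conveniences, and only the options whose accepted values are spelled out above are covered.

```python
# Long option -> accepted arguments, copied from the descriptions above.
ENUM_OPTIONS = {
    "--reset-ecc-errors": {"0", "VOLATILE", "1", "AGGREGATE"},
    "--gom": {"0", "ALL_ON", "1", "COMPUTE", "2", "LOW_DP"},
    "--applications-clocks-permission": {"0", "UNRESTRICTED", "1", "RESTRICTED"},
    "--accounting-mode": {"0", "DISABLED", "1", "ENABLED"},
    "--auto-boost-default": {"0", "DISABLED", "1", "ENABLED"},
    "--auto-boost-default-force": {"0", "DISABLED", "1", "ENABLED"},
    "--auto-boost-permission": {"0", "UNRESTRICTED", "1", "RESTRICTED"},
}


def _base_argv(gpu_id):
    argv = ["nvidia-smi"]
    if gpu_id is not None:  # without -i the change applies to all GPUs
        argv += ["-i", str(gpu_id)]
    return argv


def build_modify_command(option, value, gpu_id=None):
    """Return an argv list for one enumerated device modification option."""
    allowed = ENUM_OPTIONS.get(option)
    if allowed is None:
        raise ValueError(f"{option} is not covered by this sketch")
    value = str(value).upper()
    if value not in allowed:
        raise ValueError(f"{value!r} is not valid for {option}")
    return _base_argv(gpu_id) + [f"{option}={value}"]


def build_applications_clocks_command(memory_clock, graphics_clock, gpu_id=None):
    """Return an argv list for -ac; the pair is ordered memory, then graphics."""
    pair = f"{int(memory_clock)},{int(graphics_clock)}"
    return _base_argv(gpu_id) + [f"--applications-clocks={pair}"]
```

Most of these operations require root or administrator privileges, so the resulting command would normally be run from an elevated shell.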
UNIT MODIFICATION OPTIONS
-t, --toggle-led=STATE
Set the LED indicator state on the front and back of the unit to the specified color. See the (UNIT
ATTRIBUTES) section for a description of the LED states. Allowed colors are 0|GREEN and 1|AMBER.
Requires root.
[plus optional]
-i, --id=ID
Modify a single specified Unit. The specified id is the Unit's 0-based index in the natural enumeration
returned by the driver.