This repository was archived by the owner on Jan 12, 2026. It is now read-only.
Merged
Binary file added docs/sources/_images/advisor_roofline_gen9.png
12 changes: 9 additions & 3 deletions docs/sources/examples.rst
@@ -7,21 +7,27 @@ List of examples
.. literalinclude:: ../../examples/01-hello_dpnp.py
:language: python
:lines: 27-
:caption: Your first NumPy code running on GPU
:caption: **EXAMPLE 01:** Your first NumPy code running on GPU
:name: examples_01_hello_dpnp

.. literalinclude:: ../../examples/02-dpnp_device.py
:language: python
:lines: 27-
:caption: Select device type while creating array
:caption: **EXAMPLE 02:** Select device type while creating array
:name: examples_02_dpnp_device

.. literalinclude:: ../../examples/03-dpnp2numba-dpex.py
:language: python
:lines: 27-
:caption: Compile dpnp code with numba-dpex
:caption: **EXAMPLE 03:** Compile dpnp code with numba-dpex
:name: examples_03_dpnp2numba_dpex

.. literalinclude:: ../../examples/04-dpctl_device_query.py
:language: python
:lines: 27-
:caption: **EXAMPLE 04:** Get information about devices
:name: examples_04_dpctl_device_query

Benchmarks
**********

10 changes: 9 additions & 1 deletion docs/sources/prerequisites_and_installation.rst
@@ -1,6 +1,10 @@
.. _prerequisites_and_installation:
.. include:: ./ext_links.txt

.. |copy| unicode:: U+000A9

.. |reg| unicode:: U+000AE

.. |trade| unicode:: U+2122

Prerequisites and installation
==============================

@@ -31,7 +35,8 @@ Extensions for Python manually.
3. Data Parallel Extensions for Python
***************************************

You can skip this step if you already installed Intel® Distribution for Python or Intel® AI Analytics Toolkit.
You can skip this step if you already installed Intel |reg| Distribution for Python or Intel |reg| AI Analytics Toolkit.

The easiest way to install Data Parallel Extensions for Python is to install numba-dpex:

Conda: ``conda install numba-dpex``
@@ -40,3 +45,6 @@ Pip: ``pip install numba-dpex``

The above commands will install ``numba-dpex`` along with its dependencies, including ``dpnp``, ``dpctl``,
and required compiler runtimes and drivers.

.. WARNING::
   Before installing with conda or pip, it is strongly advised to update ``conda`` and ``pip`` to their latest versions.
20 changes: 14 additions & 6 deletions docs/sources/programming_dpep.rst
@@ -39,12 +39,12 @@ to execute your `Numpy*`_ script on GPU usually requires changing just a few lin
.. literalinclude:: ../../examples/01-hello_dpnp.py
:language: python
:lines: 27-
:caption: Your first NumPy code running on GPU
:caption: **EXAMPLE 01:** Your first NumPy code running on GPU
:name: ex_01_hello_dpnp


In this example ``np.asarray()`` creates an array on the default `SYCL*`_ device, which is ``"gpu"`` on systems
with integrated or discrete GPU (it is ``"cpu"`` on systems that do not have GPU).
with integrated or discrete GPU (it is ``"host"`` on systems that do not have GPU).
The queue associated with this array is now carried with ``x``; ``np.sum(x)`` derives it from ``x``,
and the respective pre-compiled kernel implementing ``np.sum()`` is submitted to that queue.
The result ``y`` is a 0-dimensional array allocated on the device associated with that queue too.
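The compute-follows-data rule described above can be sketched in plain Python. This is a toy model, not dpnp itself (real dpnp arrays expose the same information via their ``.device`` and ``.sycl_queue`` attributes):

```python
# Toy model of "compute follows data": every array remembers the queue
# it was allocated on, and each operation submits its kernel to the
# queue derived from its inputs.

class Array:
    def __init__(self, data, queue):
        self.data = data
        self.queue = queue  # queue the array was allocated on


def sum_op(x):
    # The kernel runs on the queue derived from the input; the 0-d
    # result is allocated on the same queue (and hence same device).
    return Array(sum(x.data), x.queue)


x = Array([1, 2, 3], queue="gpu:0")
y = sum_op(x)
print(y.data)   # 6
print(y.queue)  # gpu:0 -- same queue as x
```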
@@ -57,7 +57,7 @@ In the following example we create the array ``x`` on the GPU device, and perfor
.. literalinclude:: ../../examples/02-dpnp_device.py
:language: python
:lines: 27-
:caption: Select device type while creating array
:caption: **EXAMPLE 02:** Select device type while creating array
:name: ex_02_dpnp_device


@@ -73,7 +73,7 @@ It takes just a few lines to modify your CPU `Numba*`_ script to run on GPU.
.. literalinclude:: ../../examples/03-dpnp2numba-dpex.py
:language: python
:lines: 27-
:caption: Compile dpnp code with numba-dpex
:caption: **EXAMPLE 03:** Compile dpnp code with numba-dpex
:name: ex_03_dpnp2numba_dpex

In this example we implement a custom function ``sum_it()`` that takes an array input. We compile it with
@@ -104,6 +104,12 @@ there are some situations when you will need to use dpctl advanced capabilities:
Another frequent use is creating additional queues for profiling or for choosing out-of-order
execution of offload kernels.

.. literalinclude:: ../../examples/04-dpctl_device_query.py
:language: python
:lines: 27-
:caption: **EXAMPLE 04:** Get information about devices
:name: ex_04_dpctl_device_query

2. **Cross-platform development using Python Array API standard.** If you’re a Python developer
programming Numpy-like codes and targeting different hardware vendors and different tensor implementations,
then going with `Python* Array API Standard`_ is a good choice for writing a portable Numpy-like code.
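A minimal sketch of such portability (``normalize`` is a hypothetical helper, not part of any library): the function derives the array namespace from its input via the standard ``__array_namespace__`` method, so the same code runs unchanged on NumPy, dpnp, or any other Array API implementation.

```python
import numpy as np  # any Array API-compliant library could stand in here


def normalize(x):
    # Hypothetical helper: ask the array for its own namespace, falling
    # back to NumPy for array types that predate the Array API standard.
    xp = x.__array_namespace__() if hasattr(x, "__array_namespace__") else np
    return (x - xp.mean(x)) / xp.std(x)


x = np.asarray([1.0, 2.0, 3.0])
print(normalize(x))  # the same call would work with a dpnp array
```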
@@ -208,8 +214,10 @@ The next command generates the roof-line graph as a html file in the output dire

> advisor --report=roofline --gpu --project-dir=<output_dir> --report-output=<output_dir>/roofline_gpu.html

.. todo::
Insert high-resolution image illustrating Advisor html report
.. image:: ./_images/advisor_roofline_gen9.png
:width: 800px
:align: center
:alt: Advisor roofline analysis example on Gen9 integrated GPU

The above figure shows an example roof-line graph generated using Intel Advisor.
The X-axis in the graph represents arithmetic intensity and the Y-axis represents performance in GFLOPS.
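As a hypothetical worked example of the X-axis quantity: arithmetic intensity is the ratio of floating-point operations performed to bytes moved to and from memory. For an element-wise ``a*x + y`` kernel over ``float64`` data (kernel and sizes assumed for illustration):

```python
# Hypothetical element-wise kernel: out = a * x + y over n float64 elements
n = 10**6
flops = 2 * n            # one multiply and one add per element
bytes_moved = 3 * n * 8  # read x, read y, write out; 8 bytes per float64

intensity = flops / bytes_moved
print(round(intensity, 4))  # ~0.083 FLOP/byte: firmly in the memory-bound region
```

Kernels with such low intensity sit under the memory roof-lines of the graph, which tells you that optimizing data movement, not compute, is what will improve their performance.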
7 changes: 5 additions & 2 deletions examples/01-hello_dpnp.py
@@ -27,7 +27,10 @@
import dpnp as np

x = np.asarray([1, 2, 3])
print("Array x allocated on the device:", x.device)

y = np.sum(x)

print(y.shape) # Must be 0-dimensional array
print(y) # Expect 6
print("Result y is located on the device:", y.device) # The same device as x
print("Shape of y is:", y.shape) # 0-dimensional array
print("y=", y) # Expect 6
7 changes: 6 additions & 1 deletion examples/02-dpnp_device.py
@@ -32,5 +32,10 @@
except:
    print("GPU device is not available")

print("Array x allocated on the device:", x.device)

y = np.sum(x)
print(y) # Expect 6

print("Result y is located on the device:", y.device) # The same device as x
print("Shape of y is:", y.shape) # 0-dimensional array
print("y=", y) # Expect 6
16 changes: 9 additions & 7 deletions examples/03-dpnp2numba-dpex.py
@@ -25,20 +25,22 @@
# *****************************************************************************

import dpnp as np
from numba_dpex import jit
from numba_dpex import njit

@njit(parallel=True, fastmath=True)
def sum_it(x):
def sum_it(x): # Device queue is inferred from x. The kernel is submitted to that queue
    return np.sum(x)


x = None
x = np.empty(3)
try:
    x = np.asarray([1, 2, 3], device="gpu")
except:
    print("GPU device is not available")

y = sum_it(x)
print("Array x allocated on the device:", x.device)

y = sum_it(x)

print(y.shape) # Must be 0-dimensional array
print(y) # Expect 6
print("Result y is located on the device:", y.device) # The same device as x
print("Shape of y is:", y.shape) # 0-dimensional array
print("y=", y) # Expect 6
33 changes: 33 additions & 0 deletions examples/04-dpctl_device_query.py
@@ -0,0 +1,33 @@
# *****************************************************************************
# Copyright (c) 2022, Intel Corporation All rights reserved.
#
# Redistribution and use in source and binary forms, with or without
# modification, are permitted provided that the following conditions are met:
#
# Redistributions of source code must retain the above copyright notice,
# this list of conditions and the following disclaimer.
#
# Redistributions in binary form must reproduce the above copyright notice,
# this list of conditions and the following disclaimer in the documentation
# and/or other materials provided with the distribution.
#
# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
# AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO,
# THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
# PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR
# CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
# EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
# PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS;
# OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY,
# WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR
# OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE,
# EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
# *****************************************************************************

import dpctl

print("Platform:", dpctl.lsplatform()) # Print platform information
print("GPU devices:", dpctl.get_devices(device_type="gpu")) # Get the list of all GPU devices
print("Number of GPU devices:", dpctl.get_num_devices(device_type="gpu")) # Get the number of GPU devices
print("Has CPU devices?", dpctl.has_cpu_devices()) # Check if there are CPU devices
print("Has host device?", dpctl.has_host_device()) # Check if there is the host device