Zodiac Wang

Getting Started with NumPy


Introductory NumPy notes: a summary of, and supplement to, the Tutorial on the official NumPy website.

Table of Contents

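  • 1  Basics
    • 1.1  An example
    • 1.2  Constructing ndarrays
      • 1.2.1  Construction from iterables
      • 1.2.2  Built-in constructors
      • 1.2.3  Generation from a function
    • 1.3  Printing multidimensional arrays
    • 1.4  Basic operations
    • 1.5  Universal functions
    • 1.6  Indexing, slicing, and iterating
  • 2  Shape manipulation
    • 2.1  Changing the shape of an ndarray
    • 2.2  Combining ndarrays
    • 2.3  Splitting ndarrays
  • 3  Copies and views
    • 3.1  No copy at all
    • 3.2  Views and shallow copies
      • 3.2.1  Views produced by slicing
      • 3.2.2  Views produced by the view function
    • 3.3  Deep copies
      • 3.3.1  Copies produced by advanced indexing
      • 3.3.2  Copies produced by the copy function
      • 3.3.3  Assignment from another ndarray as an rvalue is a deep copy
      • 3.3.4  Constructing an ndarray with the array function is a deep copy
    • 3.4  Overview of functions and methods
  • 4  Advanced topics
    • 4.1  Broadcasting rules
  • 5  Advanced indexing and index tricks
    • 5.1  Indexing with sequences of integers
      • 5.1.1  A single sequence as the index
      • 5.1.2  Multiple sequences as the index
    • 5.2  Indexing with boolean sequences
      • 5.2.1  Indexing with a boolean sequence of the same shape as the target ndarray
      • 5.2.2  The second scenario is closer to the integer-sequence indexing described above
    • 5.3  ix_
    • 5.4  Indexing with strings
  • 6  Linear algebra
    • 6.1  Simple linear algebra operations
  • 7  Tricks and tips
    • 7.1  "Automatic" reshaping
    • 7.2  Vector stacking
    • 7.3  Histograms
  • 8  index
    • 8.1  The numpy.random module
    • 8.2  ndarray conversions
    • 8.3  Shape manipulation
    • 8.4  Element indexing and conversion
    • 8.5  Computation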
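  • 9  Supplementary material follows
  • 10  Understanding individual functions
    • 10.1  numpy.apply_along_axis
    • 10.2  numpy.roll
    • 10.3  Statistical functions
    • 10.4  Matrix library
    • 10.5  numpy.random.choice
    • 10.6  array_split
    • 10.7  numpy.linalg.norm
    • 10.8  bincount
    • 10.9  unique
    • 10.10  hypot
    • 10.11  unravel_index
    • 10.12  Overview of NumPy data types
  • 11  Specific tasks
    • 11.1  Importing and exporting ndarrays
      • 11.1.1  Exporting
        • 11.1.1.1  savetxt
        • 11.1.1.2  tofile
        • 11.1.1.3  np.save
      • 11.1.2  Importing
        • 11.1.2.1  loadtxt
        • 11.1.2.2  genfromtxt
        • 11.1.2.3  fromfile
        • 11.1.2.4  np.load
    • 11.2  Swapping ndarray axes [generalized transpose]
    • 11.3  Vectorized programming with NumPy
      • 11.3.1  numpy.vectorize
    • 11.4  Adding new dimensions to an ndarray
      • 11.4.1  Using newaxis or None
      • 11.4.2  Using reshape
    • 11.5  Custom dtypes
    • 11.6  Finding the most frequent element in an ndarray
      • 11.6.1  1-D
      • 11.6.2  Multidimensional
    • 11.7  Converting an ndarray to a DataFrame (N-D -> 2-D)
      • 11.7.1  A more conservative conversion
        • 11.7.1.1  Supplement: using stack to compress the column labels, turning it into the aggressive form
      • 11.7.2  A more aggressive conversion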
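  • 12  Problems and analysis
    • 12.1  numpy.sort and numpy.argsort
      • 12.1.1  numpy.sort
      • 12.1.2  numpy.argsort
    • 12.2  outer
    • 12.3  numpy.where, numpy.nonzero, and numpy.argwhere
      • 12.3.1  numpy.where
        • 12.3.1.1  Giving condition together with x and y
        • 12.3.1.2  Giving only a condition, without x and y
      • 12.3.2  numpy.nonzero
      • 12.3.3  numpy.argwhere
    • 12.4  mgrid, ogrid, meshgrid, ndenumerate, and indices
      • 12.4.1  mgrid
      • 12.4.2  ogrid
      • 12.4.3  meshgrid
      • 12.4.4  ndenumerate
      • 12.4.5  indices
    • 12.5  allclose and array_equal
    • 12.6  ndarray and matrix
    • 12.7  Automatic dimension reduction after reshape
    • 12.8  Broadcasting of (n, ) and (n, 1)
    • 12.9  tile and repeat
    • 12.10  ndarrays containing a single number
    • 12.11  NumPy performance comparison
    • 12.12  Swapping data and comparison in NumPy
      • 12.12.1  Swapping
      • 12.12.2  Comparison
    • 12.13  The reshape operation in NumPy
    • 12.14  Differences between indexing and advanced indexing (fancy indexing), with an analysis of special cases
      • 12.14.1  Slices and advanced indices appearing together
        • 12.14.1.1  The advanced index appears in the first dimension
        • 12.14.1.2  The advanced index appears in a dimension after the first
      • 12.14.2  A single integer (or a list holding a single integer) as the index
      • 12.14.3  Special slicing cases
        • 12.14.3.1  When fewer dimensions are indexed than the original ndarray has, the result is clearly still a slice, even if some indices are single numbers
        • 12.14.3.2  When as many dimensions are indexed as the original ndarray has, there are two cases
      • 12.14.4  Other scenarios that need special care
      • 12.14.5  Assigning to either a basic or an advanced index as an lvalue affects the original ndarray
    • 12.15  Getting the data at the intersections of given rows and columns
    • 12.16  Chained indexing of ndarrays
      • 12.16.1  As an rvalue
      • 12.16.2  As an lvalue
    • 12.17  A deeper look at ndarray memory layout
      • 12.17.1  Introduction
      • 12.17.2  Memory layout
        • 12.17.2.1  Visualizing ndarray memory layout
      • 12.17.3  Views and copies
        • 12.17.3.1  Direct and indirect access
        • 12.17.3.2  Intermediate variables
      • 12.17.4  Summary task
      • 12.17.5  Advanced task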
In [1]:
import numpy as np
import matplotlib.pyplot as plt

from IPython.core.interactiveshell import InteractiveShell
InteractiveShell.ast_node_interactivity = "all"

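Basics¶

The main data structure in NumPy is the ndarray, also simply called array; as the name suggests, it is an N-dimensional array.

An example¶

Some attributes of an ndarray:

  • ndarray.ndim: the number of dimensions
  • ndarray.shape: the shape; for example, a 3-D array might have shape (3,3,2)
  • ndarray.size: the number of elements in the ndarray
  • ndarray.dtype: the data type of the elements
  • ndarray.itemsize: the size of each element in bytes, equivalent to ndarray.dtype.itemsize
  • ndarray.data: the buffer that actually holds the ndarray's contents
  • ndarray.T: the transpose of the ndarray, returned as a view (similar to a reference); the term "view" is used from here on
  • ndarray.flat: a 1-D iterator over the ndarray; assigning to it overwrites the elements of the whole array, whereas ndarray.flatten() returns a flattened copy
  • ndarray.real/imag: the real/imaginary parts of a complex array
  • ndarray.nbytes: the number of bytes occupied by the array's elements
  • ndarray.base: the parent ndarray; if this ndarray is a view of another ndarray, the original ndarray is returned
  • ndarray.flags: basic information about the ndarray, such as whether it owns its underlying data (that is, whether it is a view of another ndarray) and whether it is writeable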

arange

Return evenly spaced values within a given interval.

In other words, it returns a 1-D ndarray of evenly spaced values within the given interval.

numpy.arange([start, ]stop, [step, ]dtype=None)

reshape

Gives a new shape to an array without changing its data.

Returns an ndarray with the new shape. The result is a view of the original data whenever possible; otherwise it is a copy.

numpy.reshape(a, newshape, order='C')
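
Whether reshape hands back a view or a copy can be checked with np.shares_memory. The following is a small sketch rather than part of the original tutorial; the variable names are illustrative only.

```python
import numpy as np

base = np.arange(12)

# Reshaping a contiguous array only changes shape/strides: no data is copied.
view = base.reshape(3, 4)
shares_view = np.shares_memory(base, view)

# The transpose is not C-contiguous, so flattening it via reshape needs a copy.
copied = view.T.reshape(12)
shares_copy = np.shares_memory(base, copied)

print(shares_view, shares_copy)
```

The first result shares memory with base, the second does not, so code that relies on write-through behaviour should not assume reshape always returns a view.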
In [2]:
arr = np.arange(12).reshape((3, 4))

arr.ndim
Out[2]:
2
In [3]:
arr.shape
Out[3]:
(3, 4)
In [4]:
arr.size
Out[4]:
12
In [5]:
arr.dtype
Out[5]:
dtype('int32')
In [6]:
arr.itemsize
Out[6]:
4
In [7]:
arr.data
Out[7]:
<memory at 0x00000209CE6FCB40>
In [8]:
arr.T
Out[8]:
array([[ 0,  4,  8],
       [ 1,  5,  9],
       [ 2,  6, 10],
       [ 3,  7, 11]])
In [9]:
arr.flat

for element in arr.flat:
    print(element)
Out[9]:
<numpy.flatiter at 0x209cd2340d0>
0
1
2
3
4
5
6
7
8
9
10
11
In [10]:
arr.real, arr.imag
Out[10]:
(array([[ 0,  1,  2,  3],
        [ 4,  5,  6,  7],
        [ 8,  9, 10, 11]]), array([[0, 0, 0, 0],
        [0, 0, 0, 0],
        [0, 0, 0, 0]]))
In [11]:
arr.nbytes
Out[11]:
48
In [12]:
arr.base
Out[12]:
array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11])
In [13]:
arr.flags
Out[13]:
  C_CONTIGUOUS : True
  F_CONTIGUOUS : False
  OWNDATA : False
  WRITEABLE : True
  ALIGNED : True
  WRITEBACKIFCOPY : False
  UPDATEIFCOPY : False

Assigning to ndarray.flat overwrites every element of the array; a shorter sequence is repeated as needed.

In [14]:
arr = np.arange(60).reshape((3, 4, 5))
arr

arr.flat = [1, 2]
arr
Out[14]:
array([[[ 0,  1,  2,  3,  4],
        [ 5,  6,  7,  8,  9],
        [10, 11, 12, 13, 14],
        [15, 16, 17, 18, 19]],

       [[20, 21, 22, 23, 24],
        [25, 26, 27, 28, 29],
        [30, 31, 32, 33, 34],
        [35, 36, 37, 38, 39]],

       [[40, 41, 42, 43, 44],
        [45, 46, 47, 48, 49],
        [50, 51, 52, 53, 54],
        [55, 56, 57, 58, 59]]])
Out[14]:
array([[[1, 2, 1, 2, 1],
        [2, 1, 2, 1, 2],
        [1, 2, 1, 2, 1],
        [2, 1, 2, 1, 2]],

       [[1, 2, 1, 2, 1],
        [2, 1, 2, 1, 2],
        [1, 2, 1, 2, 1],
        [2, 1, 2, 1, 2]],

       [[1, 2, 1, 2, 1],
        [2, 1, 2, 1, 2],
        [1, 2, 1, 2, 1],
        [2, 1, 2, 1, 2]]])

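Constructing ndarrays¶

Construction from iterables¶

Create an array.

Constructs an ndarray.

numpy.array(object, dtype=None, copy=True, order='K', subok=False, ndmin=0)

When an ndarray is built from an iterable sequence such as a list or tuple, NumPy infers the data type (dtype) automatically.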

In [15]:
arr = np.array([2, 3, 4])
arr
arr.dtype
Out[15]:
array([2, 3, 4])
Out[15]:
dtype('int32')
In [16]:
arr = np.array([1.2, 3.5, 5.1])
arr.dtype
Out[16]:
dtype('float64')

Do not call np.array with multiple positional arguments:

np.array(1,2,3,4) # WRONG

In [17]:
np.array([1, 2, 3, 4])  # RIGHT
Out[17]:
array([1, 2, 3, 4])

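array automatically converts a sequence of sequences into a two-dimensional or higher-dimensional array. The array function accepts only one sequence argument, so pass the whole nested sequence at once.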

np.array((1.5,2,3), (4,5,6)) # wrong
In [18]:
np.array([(1.5, 2, 3), (4, 5, 6)])
Out[18]:
array([[1.5, 2. , 3. ],
       [4. , 5. , 6. ]])

The element type of the array can be specified with the dtype parameter:

In [19]:
np.array([[1, 2], [3, 4]], dtype=complex)
Out[19]:
array([[1.+0.j, 2.+0.j],
       [3.+0.j, 4.+0.j]])

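Built-in constructors¶

When the dimensions of the ndarray are known in advance, NumPy offers several functions that construct specific ndarrays, such as ones, zeros, and empty. The contents of an array created by empty are uninitialized: they are whatever happens to be in the allocated memory block. The default dtype is float64.

numpy.zeros(shape, dtype=float, order='C')
numpy.ones(shape, dtype=None, order='C')
numpy.empty(shape, dtype=float, order='C')

Operations that change the size of an array are slow in NumPy, because memory has to be reallocated and the existing data copied.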

In [20]:
np.zeros((3, 4))
np.ones((2, 3, 4), dtype=np.int16)          # specify the dtype
np.empty((2, 3))                           # values are uninitialized
Out[20]:
array([[0., 0., 0., 0.],
       [0., 0., 0., 0.],
       [0., 0., 0., 0.]])
Out[20]:
array([[[1, 1, 1, 1],
        [1, 1, 1, 1],
        [1, 1, 1, 1]],

       [[1, 1, 1, 1],
        [1, 1, 1, 1],
        [1, 1, 1, 1]]], dtype=int16)
Out[20]:
array([[1.5, 2. , 3. ],
       [4. , 5. , 6. ]])

NumPy provides arange, a function analogous to range, which accepts start, stop, and step arguments.

np.arange([start,] stop[, step,], dtype=None)
In [21]:
np.arange(10, 30, 5)
np.arange(0, 2, 0.3)
Out[21]:
array([10, 15, 20, 25])
Out[21]:
array([0. , 0.3, 0.6, 0.9, 1.2, 1.5, 1.8])

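When arange is given float arguments, floating-point rounding can make the number of returned elements hard to predict.

In that case the linspace function is recommended. linspace has an endpoint parameter that decides whether the second argument (stop) is included as the last data point; it defaults to True.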
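
A short sketch of the difference, using illustrative values that are not from the original notes: with arange the length follows from a floating-point division, while linspace takes the number of points directly.

```python
import numpy as np

# With a float step the number of elements is computed from
# ceil((stop - start) / step), so rounding error can change the length.
a = np.arange(1.0, 1.3, 0.1)

# linspace takes the number of points directly, so the length is predictable.
b = np.linspace(1.0, 1.3, 4)                   # includes the endpoint
c = np.linspace(1.0, 1.3, 3, endpoint=False)   # excludes the endpoint

print(len(a), a)
print(len(b), b)
print(len(c), c)
```

Printing len(a) on your own machine shows whether the stop value slipped into the arange result; the linspace lengths are always exactly the requested num.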

linspace

Return evenly spaced numbers over a specified interval.

Returns num evenly spaced values over the given interval.

numpy.linspace(start, stop, num=50, endpoint=True, retstep=False, dtype=None, axis=0)
In [22]:
from numpy import pi

np.linspace(0, 2, 9)
np.linspace(0, 2*pi, 10)
np.sin(np.linspace(0, 2*pi, 10))
Out[22]:
array([0.  , 0.25, 0.5 , 0.75, 1.  , 1.25, 1.5 , 1.75, 2.  ])
Out[22]:
array([0.        , 0.6981317 , 1.3962634 , 2.0943951 , 2.7925268 ,
       3.4906585 , 4.1887902 , 4.88692191, 5.58505361, 6.28318531])
Out[22]:
array([ 0.00000000e+00,  6.42787610e-01,  9.84807753e-01,  8.66025404e-01,
        3.42020143e-01, -3.42020143e-01, -8.66025404e-01, -9.84807753e-01,
       -6.42787610e-01, -2.44929360e-16])

See also:

array, zeros, zeros_like, ones, ones_like, empty, empty_like, arange, linspace, numpy.random.rand, numpy.random.randn, fromfunction, fromfile

zeros_like

Return an array of zeros with the same shape and type as a given array.

Constructs an ndarray of zeros with the same shape as the input ndarray.

numpy.zeros_like(a, dtype=None, order='K', subok=True)
In [23]:
arr = np.arange(12).reshape((3, 4))
arr

np.zeros_like(arr)
Out[23]:
array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])
Out[23]:
array([[0, 0, 0, 0],
       [0, 0, 0, 0],
       [0, 0, 0, 0]])

ones_like

Return an array of ones with the same shape and type as a given array.

Constructs an ndarray of ones with the same shape as the input ndarray.

numpy.ones_like(a, dtype=None, order='K', subok=True)

empty_like

Return a new array with the same shape and type as a given array.

Constructs an uninitialized ndarray with the same shape as the input ndarray.

numpy.empty_like(prototype, dtype=None, order='K', subok=True)

rand

Random values in a given shape.

Returns random numbers in the given shape, drawn uniformly from [0, 1).

numpy.random.rand(d0, d1, ..., dn)
In [24]:
np.random.rand(3, 4)
Out[24]:
array([[0.18692556, 0.12425864, 0.38854238, 0.33444117],
       [0.28436856, 0.73242097, 0.92433785, 0.57142291],
       [0.01159949, 0.05949372, 0.21421816, 0.57355643]])

randn

Return a sample (or samples) from the “standard normal” distribution.

Returns random numbers in the given shape, drawn from the standard normal distribution.

numpy.random.randn(d0, d1, ..., dn)
In [25]:
np.random.randn(3, 4)
Out[25]:
array([[ 0.25060812,  0.19705778, -0.07811055,  0.28026645],
       [-1.53577158,  0.26562853,  0.79261808,  0.70344017],
       [ 0.50744347,  0.12512166, -0.55941091,  0.32812377]])

fromfile

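For numpy.fromfile, see the fromfile section.

Generation from a function¶

This is similar to a Python comprehension or generator.

np.fromfunction(function, shape, **kwargs)

The coordinate arrays for the dimensions given by shape are passed to the function, and the function's result becomes the output.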

In [26]:
def f(x, y):
    return 10*x+y


np.fromfunction(f, (5, 4), dtype=int)  # generate an array of the given shape
Out[26]:
array([[ 0,  1,  2,  3],
       [10, 11, 12, 13],
       [20, 21, 22, 23],
       [30, 31, 32, 33],
       [40, 41, 42, 43]])

Printing multidimensional arrays¶

In [27]:
a = np.arange(6)                    # 1d array
print(a)

b = np.arange(12).reshape(4, 3)          # 2d array
print(b)

c = np.arange(24).reshape(2, 3, 4)         # 3d array
print(c)
[0 1 2 3 4 5]
[[ 0  1  2]
 [ 3  4  5]
 [ 6  7  8]
 [ 9 10 11]]
[[[ 0  1  2  3]
  [ 4  5  6  7]
  [ 8  9 10 11]]

 [[12 13 14 15]
  [16 17 18 19]
  [20 21 22 23]]]

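If an array is too large, printing automatically skips the middle part. To force the whole array to be printed, use set_printoptions (recent NumPy versions reject threshold=np.nan, so sys.maxsize is used here):

import sys
np.set_printoptions(threshold=sys.maxsize)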

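Basic operations¶

Most operations on ndarrays are elementwise, that is, applied element by element, and they produce a new ndarray.

Note that in NumPy * is elementwise multiplication. For matrix multiplication, use dot or the @ operator.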

In [28]:
A = np.array([[1, 1], [0, 1]])
B = np.array([[2, 0], [3, 4]])

A*B                         # elementwise product
A.dot(B)                    # matrix product
np.dot(A, B)                # another matrix product
A@B
Out[28]:
array([[2, 0],
       [0, 4]])
Out[28]:
array([[5, 4],
       [3, 4]])
Out[28]:
array([[5, 4],
       [3, 4]])
Out[28]:
array([[5, 4],
       [3, 4]])

The operators above can also be replaced by functions, for example:

In [29]:
np.add(A, B)
Out[29]:
array([[3, 1],
       [3, 5]])

Operators such as += and *= modify the array in place rather than creating a new one, because Python first tries in-place methods such as __iadd__.

In [30]:
a = np.ones((2, 3), dtype=int)
b = np.random.random((2, 3))
a *= 3
a


b += a
b

try:
    a += b                  # b is not automatically converted to integer type
except Exception as e:
    print(e)
Out[30]:
array([[3, 3, 3],
       [3, 3, 3]])
Out[30]:
array([[3.39390146, 3.4361519 , 3.23511957],
       [3.48940575, 3.49200153, 3.10309887]])
Cannot cast ufunc add output from dtype('float64') to dtype('int32') with casting rule 'same_kind'

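When two operands have different precision, NumPy automatically uses the more precise type for the result; this is called upcasting. An in-place operation has to write into the left operand, which is why the float result above cannot be converted back to int automatically.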
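
The following sketch illustrates the same point with np.result_type and an explicit cast. The array names and values are made up for illustration; the explicit astype call is one possible way to make an in-place update work, not the only one.

```python
import numpy as np

ints = np.ones((2, 3), dtype=np.int32)
floats = np.full((2, 3), 0.5)

# Binary operations create a new array with the "wider" dtype.
result_dtype = np.result_type(ints, floats)
total = ints + floats

# In-place ops must write into the left operand; make the cast explicit.
ints += floats.astype(ints.dtype)   # 0.5 truncates to 0, so ints is unchanged

print(result_dtype, total.dtype, ints.dtype)
```

Casting explicitly makes the loss of the fractional part visible in the code instead of relying on an implicit conversion.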

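Many operations, such as summation, are implemented as ndarray methods, for example sum, min, and max. They default to axis=None and operate on the whole array, but the axis argument can be given to operate along one dimension.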

In [31]:
arr = np.arange(12).reshape(3, 4)

arr
arr.sum(axis=0)                            # sum of each column
arr.min(axis=1)                            # min of each row
arr.cumsum(axis=1)                         # cumulative sum along each row
Out[31]:
array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])
Out[31]:
array([12, 15, 18, 21])
Out[31]:
array([0, 4, 8])
Out[31]:
array([[ 0,  1,  3,  6],
       [ 4,  9, 15, 22],
       [ 8, 17, 27, 38]], dtype=int32)

Universal functions¶

NumPy has many built-in functions such as sin, cos, exp, sqrt, and add, called ufuncs (universal functions). Most of these ufuncs operate elementwise and return a new ndarray.

In [32]:
arr = np.arange(12).reshape(3, 4)

arr
np.exp(arr)
np.sqrt(arr)

arr_1 = np.ones(12).reshape(3, 4)
np.add(arr, arr_1)
Out[32]:
array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])
Out[32]:
array([[1.00000000e+00, 2.71828183e+00, 7.38905610e+00, 2.00855369e+01],
       [5.45981500e+01, 1.48413159e+02, 4.03428793e+02, 1.09663316e+03],
       [2.98095799e+03, 8.10308393e+03, 2.20264658e+04, 5.98741417e+04]])
Out[32]:
array([[0.        , 1.        , 1.41421356, 1.73205081],
       [2.        , 2.23606798, 2.44948974, 2.64575131],
       [2.82842712, 3.        , 3.16227766, 3.31662479]])
Out[32]:
array([[ 1.,  2.,  3.,  4.],
       [ 5.,  6.,  7.,  8.],
       [ 9., 10., 11., 12.]])

See also:

all, any, apply_along_axis, argmax, argmin, argsort, average, bincount, ceil, clip, conj, corrcoef, cov, cross, cumprod, cumsum, diff, dot, floor, inner, inv, lexsort, max, maximum, mean, median, min, minimum, nonzero, outer, prod, re, round, sort, std, sum, trace, transpose, var, vdot, vectorize, where

all

Test whether all array elements along a given axis evaluate to True.

Tests whether all values along the given dimension evaluate to True.

numpy.all(a, axis=None, out=None, keepdims=<no value>)
In [33]:
np.all([1.0, np.nan])
Out[33]:
True

NaN evaluates to True, because it is not equal to zero.

any

Test whether any array element along a given axis evaluates to True.

Tests whether any value along the given dimension evaluates to True.

numpy.any(a, axis=None, out=None, keepdims=<no value>)
In [34]:
np.any(np.nan)
Out[34]:
True

Again, NaN evaluates to True.

For apply_along_axis, see the supplementary section.

argmax

Returns the indices of the maximum values along an axis.

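The "coordinates" of the maximum values along the given dimension.

numpy.argmax(a, axis=None, out=None)

Note: the "coordinates" this function returns are one-dimensional indices, not full coordinate tuples. With axis=None the index refers to the flattened array; with an axis given, each index is a position along that axis.

When the maximum value occurs more than once, the "coordinate" of the first occurrence is returned.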
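
To turn the flat index into a full coordinate, np.unravel_index can be used. This is a minimal sketch with made-up data:

```python
import numpy as np

arr = np.array([[1, 9, 3],
                [9, 2, 0]])

flat_idx = np.argmax(arr)                       # index into the flattened array
coords = np.unravel_index(flat_idx, arr.shape)  # convert to a (row, col) tuple

print(flat_idx, coords, arr[coords])
```

The value 9 appears twice, and the index of its first occurrence in flattened order is the one that is reported.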

In [35]:
arr = np.arange(12).reshape(3, 4)

np.argmax(arr)

np.argmax(arr, axis=0)

np.argmax(arr, axis=1)
Out[35]:
11
Out[35]:
array([2, 2, 2, 2], dtype=int64)
Out[35]:
array([3, 3, 3], dtype=int64)

argmin

Returns the indices of the minimum values along an axis.

numpy.argmin(a, axis=None, out=None)

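Analogous to argmax.

For argsort, see the supplementary section.

For outer, see the supplementary section.

For transpose, see the supplementary section.

For vectorize, see the supplementary section.

For where, see the supplementary section.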

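Indexing, slicing, and iterating¶

Indexing, slicing, and iterating over a 1-D ndarray work much as they do for a Python list.

Each dimension of a multidimensional ndarray has its own index, and the indices are passed as a comma-separated sequence. When the indices for some dimensions are omitted, they default to :, that is, all elements along those dimensions.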

In [36]:
arr = np.arange(12).reshape(3, 4)
arr
Out[36]:
array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])
In [37]:
arr[2, 3]

arr[0:5, 1]                       # each row in the second column of arr

arr[:, 1]                        # equivalent to the previous example

arr[1:3, :]                      # each column in the second and third row of arr

arr[-1]                                  # equivalent to arr[-1, :]
Out[37]:
11
Out[37]:
array([1, 5, 9])
Out[37]:
array([1, 5, 9])
Out[37]:
array([[ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])
Out[37]:
array([ 8,  9, 10, 11])

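x[i] can also be written as x[i, ...]. The ... stands for as many : as are needed to complete the index. Assuming x is 5-dimensional:

  • x[1,2,...] is equivalent to x[1,2,:,:,:]
  • x[...,3] is equivalent to x[:,:,:,:,3]
  • x[4,...,5,:] is equivalent to x[4,:,:,5,:]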
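
These equivalences can be verified directly. The sketch below uses an arbitrary 5-D array and indices chosen to fit its shape, so the numbers differ from the ones in the list above.

```python
import numpy as np

x = np.arange(2 * 3 * 4 * 5 * 6).reshape(2, 3, 4, 5, 6)  # a 5-D array

same_1 = np.array_equal(x[1, 2, ...], x[1, 2, :, :, :])
same_2 = np.array_equal(x[..., 3], x[:, :, :, :, 3])
same_3 = np.array_equal(x[1, ..., 4, :], x[1, :, :, 4, :])

print(same_1, same_2, same_3)
print(x[1, 2, ...].shape)   # the three trailing dimensions remain
```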

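Iterating over a multidimensional array iterates over its first dimension. To iterate over every element of the ndarray, use ndarray.flat, which returns a 1-D iterator over the original array.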

In [38]:
for row in arr:
    print(row)
[0 1 2 3]
[4 5 6 7]
[ 8  9 10 11]
In [39]:
for element in arr.flat:
    print(element)
0
1
2
3
4
5
6
7
8
9
10
11

See also:

Indexing, Indexing (reference), newaxis, ndenumerate, indices

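For newaxis, see the supplementary section.

For ndenumerate, see the supplementary section.

For indices, see the supplementary section.

Shape manipulation¶

Changing the shape of an ndarray¶

Main functions used: ravel, flatten, reshape, resize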

ravel

Return a contiguous flattened array.

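Returns a flattened 1-D view (a copy is made only when a view is not possible).

numpy.ravel(a, order='C')

The order parameter controls the flattening order: with order='C' the elements are read row by row (the last axis index changes fastest); with order='F' they are read column by column (the first axis index changes fastest).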

flatten

Return a copy of the array collapsed into one dimension.

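Returns a flattened 1-D copy.

ndarray.flatten(order='C')

ravel() returns a view of the original ndarray whenever possible, and changes to the view's data affect the original ndarray; flatten() always returns a flattened copy.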
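
A small sketch of the difference, assuming a C-contiguous input (variable names are illustrative):

```python
import numpy as np

arr = np.arange(12).reshape(3, 4)

r = arr.ravel()      # view here, because arr is C-contiguous
f = arr.flatten()    # always a copy

r[0] = 100           # writes through to arr
f[1] = 200           # does not touch arr

# With order='F' the requested order differs from the memory layout,
# so ravel has to copy as well.
r_f = arr.ravel(order='F')

print(arr[0, 0], arr[0, 1])
print(np.shares_memory(arr, r), np.shares_memory(arr, f), np.shares_memory(arr, r_f))
```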

In [40]:
arr = np.arange(12).reshape((3, 4))
arr.ravel()  # flattened view
arr.ravel(order='F')  # F order
Out[40]:
array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11])
Out[40]:
array([ 0,  4,  8,  1,  5,  9,  2,  6, 10,  3,  7, 11])
In [41]:
arr.reshape(4, 3)

arr.T
arr.T.shape
arr.shape
Out[41]:
array([[ 0,  1,  2],
       [ 3,  4,  5],
       [ 6,  7,  8],
       [ 9, 10, 11]])
Out[41]:
array([[ 0,  4,  8],
       [ 1,  5,  9],
       [ 2,  6, 10],
       [ 3,  7, 11]])
Out[41]:
(4, 3)
Out[41]:
(3, 4)
In [42]:
arr.flatten()
arr.flatten().flags
Out[42]:
array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11])
Out[42]:
  C_CONTIGUOUS : True
  F_CONTIGUOUS : True
  OWNDATA : True
  WRITEABLE : True
  ALIGNED : True
  WRITEBACKIFCOPY : False
  UPDATEIFCOPY : False

ndarray.resize

Change shape and size of array in-place.

Changes the shape of the ndarray in place (with a reference check).

ndarray.resize(new_shape, refcheck=True)

numpy.resize

Return a new array with the specified shape.

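Returns a copy with the new shape (no reference check).

numpy.resize(a, new_shape)

resize differs from reshape in two ways. First, ndarray.resize performs a reference check, whereas reshape does not. Second, resize can change the ndarray to a shape that holds more or fewer elements: when there are fewer, the surplus elements are simply dropped; when there are more, numpy.resize repeats the original data, whereas ndarray.resize pads with zeros.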

In [43]:
arr = np.array([[1, 2], [3, 4]], order='C')
c = arr
arr.resize((2, 2))  # the size is unchanged, so the in-place resize works even though a reference exists

try:
    arr.resize((2, 1))
except Exception as e:
    print(e)
cannot resize an array that references or is referenced
by another array in this way.
Use the np.resize function or refcheck=False

In fact, even when a reference exists, an in-place resize still works as long as the underlying data does not have to change.
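
The two ways of growing an array behave differently, as the following sketch shows. refcheck=False is passed here only so the example also runs in interactive shells that keep extra references; use it with care in real code.

```python
import numpy as np

src = np.array([[1, 2], [3, 4]])

# np.resize returns a new array and repeats the data when growing.
grown = np.resize(src, (2, 3))       # [[1, 2, 3], [4, 1, 2]]

# ndarray.resize works in place and pads with zeros when growing.
own = np.array([[1, 2], [3, 4]])
own.resize((2, 3), refcheck=False)   # [[1, 2, 3], [4, 0, 0]]

print(grown)
print(own)
```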

See also:

ndarray.shape, reshape, resize, ravel

Combining ndarrays¶

Main functions used: vstack, hstack, column_stack, concatenate

vstack

Stack arrays in sequence vertically (row wise).

Stacks ndarrays vertically.

numpy.vstack(tup)

hstack

Stack arrays in sequence horizontally (column wise).

Stacks ndarrays horizontally.

numpy.hstack(tup)

column_stack

Stack 1-D arrays as columns into a 2-D array.

Stacks 1-D ndarrays as the columns of a 2-D ndarray.

numpy.column_stack(tup)

concatenate

Join a sequence of arrays along an existing axis.

Joins ndarrays along the specified dimension.

numpy.concatenate((a1, a2, ...), axis=0, out=None)

In practice, using concatenate directly is usually all you need: it is efficient and gives explicit control over the axis.

In [44]:
arr_a = np.arange(12).reshape(3, 4)
arr_a

arr_b = np.arange(12, 24).reshape(3, 4)
arr_b

np.vstack((arr_a, arr_b))

np.hstack((arr_a, arr_b))
Out[44]:
array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])
Out[44]:
array([[12, 13, 14, 15],
       [16, 17, 18, 19],
       [20, 21, 22, 23]])
Out[44]:
array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11],
       [12, 13, 14, 15],
       [16, 17, 18, 19],
       [20, 21, 22, 23]])
Out[44]:
array([[ 0,  1,  2,  3, 12, 13, 14, 15],
       [ 4,  5,  6,  7, 16, 17, 18, 19],
       [ 8,  9, 10, 11, 20, 21, 22, 23]])

column_stack is equivalent to hstack only for 2-D ndarrays.

In [45]:
np.column_stack((arr_a, arr_b))
Out[45]:
array([[ 0,  1,  2,  3, 12, 13, 14, 15],
       [ 4,  5,  6,  7, 16, 17, 18, 19],
       [ 8,  9, 10, 11, 20, 21, 22, 23]])
In [46]:
a = np.array([4., 2.])
b = np.array([3., 8.])
np.column_stack((a, b))     # returns a 2D array

np.hstack((a, b))           # the result is different
Out[46]:
array([[4., 3.],
       [2., 8.]])
Out[46]:
array([4., 2., 3., 8.])

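In fact, for inputs with two or more dimensions, hstack concatenates along the second dimension, whereas vstack concatenates along the first dimension.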

In [47]:
arr_a = np.arange(24).reshape((2, 3, 4))
arr_a

arr_b = np.arange(24, 48).reshape((2, 3, 4))
arr_b

np.vstack((arr_a, arr_b)).shape

np.hstack((arr_a, arr_b)).shape
Out[47]:
array([[[ 0,  1,  2,  3],
        [ 4,  5,  6,  7],
        [ 8,  9, 10, 11]],

       [[12, 13, 14, 15],
        [16, 17, 18, 19],
        [20, 21, 22, 23]]])
Out[47]:
array([[[24, 25, 26, 27],
        [28, 29, 30, 31],
        [32, 33, 34, 35]],

       [[36, 37, 38, 39],
        [40, 41, 42, 43],
        [44, 45, 46, 47]]])
Out[47]:
(4, 3, 4)
Out[47]:
(2, 6, 4)

concatenate lets you choose the dimension to join along; with axis=None the inputs are flattened and joined into a 1-D array.

In [48]:
arr_a = np.arange(12).reshape(3, 4)
arr_a

arr_b = np.arange(12, 24).reshape(3, 4)
arr_b

np.concatenate((arr_a, arr_b), axis=0)

np.concatenate((arr_a, arr_b), axis=1)

np.concatenate((arr_a, arr_b), axis=None)
Out[48]:
array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])
Out[48]:
array([[12, 13, 14, 15],
       [16, 17, 18, 19],
       [20, 21, 22, 23]])
Out[48]:
array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11],
       [12, 13, 14, 15],
       [16, 17, 18, 19],
       [20, 21, 22, 23]])
Out[48]:
array([[ 0,  1,  2,  3, 12, 13, 14, 15],
       [ 4,  5,  6,  7, 16, 17, 18, 19],
       [ 8,  9, 10, 11, 20, 21, 22, 23]])
Out[48]:
array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16,
       17, 18, 19, 20, 21, 22, 23])

In addition, r_ and c_ are useful when building ndarrays.

In [49]:
np.r_[1:4, 0, 4]
Out[49]:
array([1, 2, 3, 0, 4])

See also:

hstack, vstack, column_stack, concatenate, c_, r_

Splitting ndarrays¶

Main functions used: hsplit, vsplit, split

hsplit

Split an array into multiple sub-arrays horizontally (column-wise).

Splits an ndarray horizontally.

numpy.hsplit(ary, indices_or_sections)

vsplit

Split an array into multiple sub-arrays vertically (row-wise).

Splits an ndarray vertically.

numpy.vsplit(ary, indices_or_sections)

split

Split an array into multiple sub-arrays.

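Splits an ndarray along the specified dimension.

numpy.split(ary, indices_or_sections, axis=0)

indices_or_sections can be an integer giving the number of equal parts, or a sequence of indices for finer control. For example, [2, 3] splits as follows: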

  • arr[:2]
  • arr[2:3]
  • arr[3:]

axis specifies the dimension along which to split.

In [50]:
arr = np.arange(12).reshape(3, 4)
arr

np.hsplit(arr, 2)   # split into 2 equal parts

np.hsplit(arr, (1, 2))   # split after the first and second columns
Out[50]:
array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])
Out[50]:
[array([[0, 1],
        [4, 5],
        [8, 9]]), array([[ 2,  3],
        [ 6,  7],
        [10, 11]])]
Out[50]:
[array([[0],
        [4],
        [8]]), array([[1],
        [5],
        [9]]), array([[ 2,  3],
        [ 6,  7],
        [10, 11]])]
In [51]:
np.split(arr, 3)
Out[51]:
[array([[0, 1, 2, 3]]), array([[4, 5, 6, 7]]), array([[ 8,  9, 10, 11]])]
In [52]:
np.split(arr, [2, 3])
Out[52]:
[array([[0, 1, 2, 3],
        [4, 5, 6, 7]]),
 array([[ 8,  9, 10, 11]]),
 array([], shape=(0, 4), dtype=int32)]
In [53]:
np.split(arr, 2, axis=1)
Out[53]:
[array([[0, 1],
        [4, 5],
        [8, 9]]), array([[ 2,  3],
        [ 6,  7],
        [10, 11]])]

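Copies and views¶

NumPy has roughly three cases when it comes to copying: no copy at all, a view (also called a shallow copy), and a copy (also called a deep copy).

No copy at all¶

Assigning an ndarray to a variable name as an rvalue copies no underlying data; it only binds another name to the same ndarray. Function calls behave the same way: arguments are passed by reference, and the ndarray object's data is not copied (neither the underlying data nor metadata such as shape and data type).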

In [54]:
arr = np.arange(12)
s = arr  # bind the name s to the same object as arr
s is arr  # s is arr
Out[54]:
True
In [55]:
s.shape = 3, 4    # changing the shape of s naturally changes arr too
arr.shape
Out[55]:
(3, 4)
In [56]:
# a function call receives something like a "pointer"
def f(x):
    print(id(x))


id(arr)

f(arr)
Out[56]:
2241145493744
2241145493744

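Views and shallow copies¶

A view means that different ndarrays share the same block of memory as their underlying data: external attributes such as shape, data type, and strides may differ, but the underlying memory block is the same. The ndarray acting as the base owns the memory block, and every ndarray that is a view borrows its data from that block. Changing a view's metadata, such as its shape, does not affect the original ndarray, but changing a view's data changes the underlying data of the original ndarray.

Slicing or basic indexing always returns a view, but only when the expression is used as an rvalue. When it is used as an lvalue, the assignment directly modifies the original ndarray and no new ndarray (view or copy) is created.

In fact, whenever the new ndarray can be described just by adjusting the three ingredients shape, dtype, and strides, so that the memory it needs can be marked out within the original memory block, NumPy tends to return a view.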
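
To make the role of strides concrete, here is a minimal sketch (with int64 fixed explicitly so the byte counts are the same on every platform) comparing an array with one of its slices:

```python
import numpy as np

arr = np.arange(12, dtype=np.int64).reshape(3, 4)
s = arr[::2, 1:]          # rows 0 and 2, columns 1..3

print(arr.strides)        # (32, 8): 4 items * 8 bytes per row, 8 bytes per item
print(s.strides)          # (64, 8): every second row, same column step
print(s.shape)            # (2, 3)
print(np.shares_memory(arr, s))
```

The slice needs no data of its own: a different shape, a doubled row stride, and a start offset of one element are enough to describe it inside the memory block owned by the base.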

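For the difference between basic and advanced indexing, see the relevant part of the supplementary material.

For ndarray memory management, see from-python-to-numpy.

In addition, the view method also returns a view.

Views produced by slicing¶

A slice here means that, when indexing an ndarray, the index for each dimension has the start:stop:step form (a single integer is allowed as well), as in arr[0:4:2, 1:5:1] or arr[1, :-1].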

In [57]:
arr = np.ones((3, 4))

s = arr[0:4:2, 1:5:1]

s.flags.owndata  # s does not own the underlying data
s.base is (arr if arr.flags.owndata else arr.base)  # s is a view of arr
Out[57]:
False
Out[57]:
True
In [58]:
s.shape = 2, 3  # setting the shape of s does not change the shape of arr
s

arr.shape
Out[58]:
array([[1., 1., 1.],
       [1., 1., 1.]])
Out[58]:
(3, 4)
In [59]:
s[0, 2] = 999  # changing the data of s also changes the data of arr
s

arr
Out[59]:
array([[  1.,   1., 999.],
       [  1.,   1.,   1.]])
Out[59]:
array([[  1.,   1.,   1., 999.],
       [  1.,   1.,   1.,   1.],
       [  1.,   1.,   1.,   1.]])

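Views produced by the view function¶

view

New view of array with the same data.

Returns a view of the original ndarray. It can be used to change how the underlying memory is interpreted (for example, the same 64 bytes as eight 8-byte elements or as four 16-byte elements).

ndarray.view(dtype=None, type=None)

The dtype parameter changes how the underlying memory block is interpreted, that is, how many bytes make up one element. The type parameter selects the Python type of the returned view, for example np.ndarray or np.matrix. Calling view without arguments simply returns a view and leaves the other attributes of the original ndarray unchanged.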

In [60]:
arr = np.ones((3, 4))
s = arr.view()

s.flags.owndata  # s does not own the underlying data
s.base is (arr if arr.flags.owndata else arr.base)  # s is a view of arr
Out[60]:
False
Out[60]:
True

Another, more common use of view is to change how the underlying memory block is interpreted.

In [61]:
arr = np.ones((3, 4), dtype=np.int8)
arr

# reinterpreting the bytes changes both the element values and the shape seen through s
s = arr.view(dtype=np.int16)
s
s.shape
Out[61]:
array([[1, 1, 1, 1],
       [1, 1, 1, 1],
       [1, 1, 1, 1]], dtype=int8)
Out[61]:
array([[257, 257],
       [257, 257],
       [257, 257]], dtype=int16)
Out[61]:
(3, 2)
In [62]:
s.flags.owndata  # s does not own the underlying data
s.base is (arr if arr.flags.owndata else arr.base)  # s is a view of arr
Out[62]:
False
Out[62]:
True

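Deep copies¶

First, a definition: advanced indexing (also called fancy indexing). In contrast to the slicing described above, indexing an ndarray counts as advanced indexing as soon as the index for some dimension is given as an explicit sequence such as [1,2,4] or [True, True, False], rather than in the start:stop:step form such as 1:5:2.

The reason is that once the index for some dimension is given as an explicit sequence (for example [1, 2, 4]), NumPy generally cannot describe the new ndarray's layout within the original memory block using only shape, data type, and strides (plus possibly a start offset), so the data has to be copied into a new memory block.

Advanced indexing always returns a copy, that is, a deep copy of the selected data, but only when the expression is used as an rvalue. When it is used as an lvalue, the assignment directly modifies the original ndarray and no new ndarray (view or copy) is created.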
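
The rvalue/lvalue distinction can be seen in a few lines. This is an illustrative sketch with made-up values:

```python
import numpy as np

arr = np.arange(6)

picked = arr[[0, 2, 4]]        # right-hand side: a copy
picked[:] = -1                 # arr is untouched
after_copy_write = arr.copy()  # snapshot to show arr has not changed yet

arr[[0, 2, 4]] = 99            # left-hand side: writes into arr itself

print(after_copy_write)
print(arr)
```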

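For details on using advanced indexing, see the section on advanced indexing tricks below.

In addition, the copy function makes a deep copy of an ndarray; when an ndarray is the lvalue of an assignment, the underlying data is copied from the ndarray on the right-hand side; and constructing an ndarray with the array function also copies the data of the input sequence.

Copies produced by advanced indexing¶

For the specific advanced indexing methods, see later sections.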

In [63]:
arr = np.ones((3, 4))
s = arr[[0, 1, 2]]
s

s.flags.owndata  # s owns its underlying data
s.base is (arr if arr.flags.owndata else arr.base)  # s is an independent ndarray
Out[63]:
array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]])
Out[63]:
True
Out[63]:
False

Copies produced by the copy function¶

numpy.copy

Return an array copy of the given object.

Returns a copy of the ndarray.

numpy.copy(a, order='K')

ndarray.copy

Return a copy of the array.

Returns a copy.

ndarray.copy(order='C')
In [64]:
arr = np.ones((3, 4))
s = arr.copy()

s.flags.owndata  # s owns its underlying data
s.base is (arr if arr.flags.owndata else arr.base)  # s is an independent ndarray
Out[64]:
True
Out[64]:
False
In [65]:
s[0, 0] = 9999  # modifying s does not affect arr
arr
Out[65]:
array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]])

Assignment from another ndarray as an rvalue is a deep copy¶

In [66]:
arr = np.ones((3, 4))
s = np.zeros((4, 4))
s.flags.owndata
Out[66]:
True
In [67]:
arr[:2, :2] = s[:2, :2]
arr
arr.flags.owndata  # deep copy
Out[67]:
array([[0., 0., 1., 1.],
       [0., 0., 1., 1.],
       [1., 1., 1., 1.]])
Out[67]:
True

Constructing an ndarray with the array function is a deep copy¶

In [68]:
arr = np.ones((3, 4))

s = np.array(arr)
s.flags.owndata  # s has its own data
Out[68]:
True
In [69]:
s[0, 0] = 999  # does not affect arr

arr
Out[69]:
array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]])

Overview of functions and methods¶

  • Array Creation

      arange, array, copy, empty, empty_like, eye, fromfile, fromfunction, identity, linspace, logspace, mgrid, ogrid, ones, ones_like, r_, zeros, zeros_like
  • Conversions

      ndarray.astype, atleast_1d, atleast_2d, atleast_3d, mat
  • Manipulations

      array_split, column_stack, concatenate, diagonal, dsplit, dstack, hsplit,
      hstack, ndarray.item, newaxis, ravel, repeat, reshape, resize, squeeze, swapaxes, take, transpose, vsplit, vstack
  • Questions

      all, any, nonzero, where
  • Ordering

      argmax, argmin, argsort, max, min, ptp, searchsorted, sort
  • Operations

      choose, compress, cumprod, cumsum, inner, ndarray.fill, imag, prod, put, putmask, real, sum
  • Basic Statistics

      cov, mean, std, var
  • Basic Linear Algebra

      cross, dot, outer, linalg.svd, vdot

Advanced topics¶

Broadcasting rules¶

Broadcasting allows universal functions to deal in a meaningful way with inputs that do not have exactly the same shape.

In other words, broadcasting handles inputs whose shapes do not match in a meaningful way.

The first rule of broadcasting is that if all input arrays do not have the same number of dimensions, a “1” will be repeatedly prepended to the shapes of the smaller arrays until all the arrays have the same number of dimensions.

The second rule of broadcasting ensures that arrays with a size of 1 along a particular dimension act as if they had the size of the array with the largest shape along that dimension. The value of the array element is assumed to be the same along that dimension for the “broadcast” array.

After application of the broadcasting rules, the sizes of all arrays must match. More details can be found in Broadcasting.
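
The rules quoted above can be traced on a small example. The shapes below are chosen purely for illustration:

```python
import numpy as np

a = np.arange(12).reshape(3, 4)   # shape (3, 4)
b = np.array([10, 20, 30, 40])    # shape (4,)  -> treated as (1, 4)
c = np.array([[1], [2], [3]])     # shape (3, 1)

ab = a + b                        # (3, 4) + (1, 4) -> (3, 4)
bc = b + c                        # (1, 4) + (3, 1) -> (3, 4)

# Shapes that cannot be matched raise a ValueError.
try:
    a + np.ones(3)                # (3, 4) vs (1, 3): 4 != 3
    mismatch_raised = False
except ValueError:
    mismatch_raised = True

print(ab.shape, bc.shape, mismatch_raised)
```

Rule one turns (4,) into (1, 4); rule two then stretches every size-1 dimension to the size of the other operand.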

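Advanced indexing and index tricks¶

Advanced indexing (also called fancy indexing) is a concept distinct from basic indexing. In contrast to the basic indexing and slicing described above, indexing an ndarray counts as advanced indexing as soon as the index for some dimension is given as an explicit sequence such as [1,2,4] or [True, True, False], rather than in the start:stop:step form such as 1:5:2.

Indexing with sequences of integers¶

Coordinates can be written in two ways:

  • Intuitive coordinates: each element has the form (x,y,z,...), that is, one complete "coordinate"
  • Counter-intuitive "coordinates": a list in which each part holds all the coordinate values along one dimension (for example, the first part is the collection of the x values of all "coordinates")

Assume the data points are 3-dimensional and there are n of them.

The intuitive coordinates then look like:

$$\left[ \left( x_1,y_1,z_1 \right) ,\left( x_2,y_2,z_2 \right) ,\cdots ,\left( x_n,y_n,z_n \right) \right] $$

The counter-intuitive "coordinates" look like:

$$\left[ \left( x_1,x_2,\cdots ,x_n \right) ,\left( y_1,y_2,\cdots ,y_n \right) ,\left( z_1,z_2,\cdots ,z_n \right) \right] $$

The indexing method in this section uses the counter-intuitive "coordinates": each part of the argument holds the "coordinate values" along one dimension, not one complete "coordinate".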
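
Converting between the two forms is a transposition, which zip(*...) does in plain Python. A minimal sketch with hypothetical points:

```python
import numpy as np

arr = np.arange(12).reshape(3, 4)

# "Intuitive" form: one (row, col) tuple per point.
points = [(0, 1), (2, 3), (1, 0)]

# Per-axis form expected by NumPy: all row indices, then all column indices.
rows, cols = zip(*points)          # (0, 2, 1) and (1, 3, 0)
values = arr[list(rows), list(cols)]

print(values)
```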

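However many dimensions the target ndarray has, what matters is the sequence used as the index: the main distinction is whether one sequence or several sequences are used. A single list, several lists, a single ndarray, or several ndarrays can all serve as the index. In the NumPy version used here, a single nested list has its outermost level stripped (a deprecated behavior, as the FutureWarning in the outputs below shows), whereas a single ndarray is always treated as a whole even when it has more than one dimension.

arr[np.array([[1, 2], [3, 4]])] # a single sequence: indexes the first dimension of arr
arr[np.array([1, 2]), np.array([3, 4])]  # two sequences: index the first and second dimensions of arr

From first to last, these sequences hold the "coordinates" along the first dimension, the second dimension, and so on, of the target ndarray.

When indexing with multiple sequences, the index ndarrays must have the same shape (more precisely, shapes that broadcast to a common shape).

The leading dimensions of the resulting ndarray match the shape of the index sequences, and the dimensions left over after indexing are appended. For example, indexing an ndarray of shape (2,3,4) with two ndarrays of shape (3,3) yields an ndarray of shape (3,3,4): the base shape is (3,3), indexing the first two dimensions of the target leaves (4,), so the final shape is (3,3,4).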
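
The shape rule from this example can be checked directly. The index values below are arbitrary; only their shape matters.

```python
import numpy as np

arr = np.arange(24).reshape(2, 3, 4)

# Two index arrays of the same shape (3, 3), one per leading axis.
i = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 0, 1]])          # indices into axis 0 (size 2)
j = np.array([[0, 1, 2],
              [2, 1, 0],
              [1, 1, 1]])          # indices into axis 1 (size 3)

out = arr[i, j]                    # shape (3, 3) + (4,) -> (3, 3, 4)

print(out.shape)
print(np.array_equal(out[0, 1], arr[i[0, 1], j[0, 1]]))
```

Each entry out[a, b] is the length-4 row arr[i[a, b], j[a, b]], which is where the trailing (4,) comes from.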

A single sequence as the index¶

In [70]:
arr = np.arange(12)**2
i = np.array([1, 1, 3, 8, 5])
arr[i]  # indexes the first dimension

j = np.array([[3, 4], [9, 7]])
arr[j]  # still indexes the first dimension
Out[70]:
array([ 1,  1,  9, 64, 25], dtype=int32)
Out[70]:
array([[ 9, 16],
       [81, 49]], dtype=int32)
In [71]:
palette = np.array([[0, 0, 0],                # black
                    [255, 0, 0],              # red
                    [0, 255, 0],              # green
                    [0, 0, 255],              # blue
                    [255, 255, 255]])       # white

image = np.array([[0, 1, 2, 0],           # each value corresponds to a color in the palette
                  [0, 3, 4, 0]])
palette[image]  # indexing with a single sequence
palette[image].shape  # (2,4)->(2,4,3)
Out[71]:
array([[[  0,   0,   0],
        [255,   0,   0],
        [  0, 255,   0],
        [  0,   0,   0]],

       [[  0,   0,   0],
        [  0,   0, 255],
        [255, 255, 255],
        [  0,   0,   0]]])
Out[71]:
(2, 4, 3)

Multiple sequences as the index¶

In [72]:
arr = np.arange(12).reshape(3, 4)
j = np.array([[1, 1], [1, 2]])

arr[j]  # indexes the first dimension
arr[j].shape

arr[j, j]  # indexes the first 2 dimensions
arr[j, j].shape
Out[72]:
array([[[ 4,  5,  6,  7],
        [ 4,  5,  6,  7]],

       [[ 4,  5,  6,  7],
        [ 8,  9, 10, 11]]])
Out[72]:
(2, 2, 4)
Out[72]:
array([[ 5,  5],
       [ 5, 10]])
Out[72]:
(2, 2)
In [73]:
arr = np.arange(12).reshape(3, 4)
arr

i = np.array([[0, 1],
              [1, 2]])
j = np.array([[2, 1],
              [3, 3]])

arr[i, j]                                   # indexing with two sequences
arr[[i, j]]                                # the outermost list is stripped, so the result is the same as above
arr[(i, j)]
Out[73]:
array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])
Out[73]:
array([[ 2,  5],
       [ 7, 11]])
c:\users\twang\appdata\local\conda\conda\envs\py36\lib\site-packages\ipykernel_launcher.py:10: FutureWarning: Using a non-tuple sequence for multidimensional indexing is deprecated; use `arr[tuple(seq)]` instead of `arr[seq]`. In the future this will be interpreted as an array index, `arr[np.array(seq)]`, which will result either in an error or a different result.
  # Remove the CWD from sys.path while we load stuff.
Out[73]:
array([[ 2,  5],
       [ 7, 11]])
Out[73]:
array([[ 2,  5],
       [ 7, 11]])
In [74]:
arr[i, 2]
arr[:, j]
Out[74]:
array([[ 2,  6],
       [ 6, 10]])
Out[74]:
array([[[ 2,  1],
        [ 3,  3]],

       [[ 6,  5],
        [ 7,  7]],

       [[10,  9],
        [11, 11]]])
In [75]:
arr = np.arange(20).reshape(4, 5)

s = np.array([i, j])
arr[s]
arr[s].shape

arr[tuple(s)]     # equivalent to arr[i, j]: tuple(s) turns the outermost level of the ndarray into a tuple, which NumPy treats as one index per dimension
Out[75]:
array([[[[ 0,  1,  2,  3,  4],
         [ 5,  6,  7,  8,  9]],

        [[ 5,  6,  7,  8,  9],
         [10, 11, 12, 13, 14]]],


       [[[10, 11, 12, 13, 14],
         [ 5,  6,  7,  8,  9]],

        [[15, 16, 17, 18, 19],
         [15, 16, 17, 18, 19]]]])
Out[75]:
(2, 2, 2, 5)
Out[75]:
array([[ 2,  6],
       [ 8, 13]])

Combining i and j into an ndarray prevents the stripping, so the result is equivalent to indexing the first dimension only

In [76]:
arr = np.arange(25).reshape(5, 5)

arr[[[1, 2], [3, 4]]]  # the outermost layer of a nested integer list is stripped
arr[[[1, 2], [3, 4]], ]
arr[[1, 2], [3, 4]]
c:\users\twang\appdata\local\conda\conda\envs\py36\lib\site-packages\ipykernel_launcher.py:3: FutureWarning: Using a non-tuple sequence for multidimensional indexing is deprecated; use `arr[tuple(seq)]` instead of `arr[seq]`. In the future this will be interpreted as an array index, `arr[np.array(seq)]`, which will result either in an error or a different result.
  This is separate from the ipykernel package so we can avoid doing imports until
Out[76]:
array([ 8, 14])
Out[76]:
array([[[ 5,  6,  7,  8,  9],
        [10, 11, 12, 13, 14]],

       [[15, 16, 17, 18, 19],
        [20, 21, 22, 23, 24]]])
Out[76]:
array([ 8, 14])

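arr[i, j] and arr[[i, j]] give the same result when i and j are both ndarrays. One way to read this is that NumPy automatically strips the outermost [] layer: whether it is a list of ndarrays or a list of integer lists, the outermost layer is stripped. In other words, NumPy prefers to index along more dimensions rather than treat the sequence as a whole and index along a single dimension. As the FutureWarning in the outputs above states, this non-tuple form is deprecated, and arr[tuple(seq)] is the form the warning recommends.

There is one special case for stripping the outermost list, namely an expression such as arr[[1,2,3]]; see the supplementary material on the difference between basic and advanced indexing.

An example of selecting data with a multi-index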

In [77]:
time = np.linspace(20, 145, 5)                 # time scale
data = np.sin(np.arange(20)).reshape(5, 4)      # 4 time-dependent series
time

data

# index of the maxima for each series
ind = data.argmax(axis=0)
ind

time_max = time[ind]                       # times corresponding to the maxima

# => data[ind[0],0], data[ind[1],1]...
data_max = data[ind, range(data.shape[1])]

time_max

data_max

np.all(data_max == data.max(axis=0))
Out[77]:
array([ 20.  ,  51.25,  82.5 , 113.75, 145.  ])
Out[77]:
array([[ 0.        ,  0.84147098,  0.90929743,  0.14112001],
       [-0.7568025 , -0.95892427, -0.2794155 ,  0.6569866 ],
       [ 0.98935825,  0.41211849, -0.54402111, -0.99999021],
       [-0.53657292,  0.42016704,  0.99060736,  0.65028784],
       [-0.28790332, -0.96139749, -0.75098725,  0.14987721]])
Out[77]:
array([2, 0, 3, 1], dtype=int64)
Out[77]:
array([ 82.5 ,  20.  , 113.75,  51.25])
Out[77]:
array([0.98935825, 0.84147098, 0.99060736, 0.6569866 ])
Out[77]:
True

An array-indexed ndarray can be used as an assignment target, but when an index is repeated the last assignment wins, so this usage is best avoided

In [78]:
a = np.arange(5)
a[[0, 0, 2]] = [1, 2, 3]
a
Out[78]:
array([2, 1, 3, 3, 4])

Using Python's += on an array-indexed ndarray can give unexpected results

In [79]:
a = np.arange(5)
a[[0, 0, 2]] += 1
a
Out[79]:
array([1, 1, 3, 3, 4])

Although index 0 appears twice, the element is incremented only once, because a[idx] += 1 is equivalent to a[idx] = a[idx] + 1.
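When repeated indices are meant to accumulate, the unbuffered `ufunc.at` method is one option. The sketch below uses `np.add.at`, which is not part of the original notes and is shown here only as a possible alternative.

```python
import numpy as np

a = np.arange(5)
np.add.at(a, [0, 0, 2], 1)  # unbuffered: index 0 is incremented twice
print(a)  # [2 1 3 3 4]
```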

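Indexing with boolean sequences¶

There are two typical scenarios:

Indexing with a boolean sequence of the same shape as the target ndarray¶

This kind of indexing returns a 1-D array and acts as a filter. The returned ndarray is a copy, not a view; nevertheless, assigning through the boolean index (arr[mask] = value) modifies the original ndarray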

In [80]:
arr = np.arange(12).reshape(3, 4)
mask = arr > 4
mask

arr[mask]
Out[80]:
array([[False, False, False, False],
       [False,  True,  True,  True],
       [ True,  True,  True,  True]])
Out[80]:
array([ 5,  6,  7,  8,  9, 10, 11])
In [81]:
arr[mask] = 0
arr
Out[81]:
array([[0, 1, 2, 3],
       [4, 0, 0, 0],
       [0, 0, 0, 0]])
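The difference between reading through a mask and assigning through a mask can be seen in a small experiment. This is a sketch only; the name `selected` is introduced here for illustration.

```python
import numpy as np

arr = np.arange(12).reshape(3, 4)
mask = arr > 4

selected = arr[mask]      # boolean indexing returns a new array
selected[:] = -1          # modifying it leaves arr untouched
print(arr.max())          # 11
print(np.shares_memory(selected, arr))  # False

arr[mask] = 0             # assignment through the mask writes into arr
print(arr.max())          # 4
```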

An example that generates the Mandelbrot set

In [82]:
def mandelbrot(h, w, maxit=20):
    """Returns an image of the Mandelbrot fractal of size (h,w)."""
    y, x = np.ogrid[-1.4:1.4:h*1j, -2:0.8:w*1j]
    c = x+y*1j
    z = c
    divtime = maxit + np.zeros(z.shape, dtype=int)

    for i in range(maxit):
        z = z**2 + c
        diverge = z*np.conj(z) > 2**2            # who is diverging
        div_now = diverge & (divtime == maxit)  # who is diverging now
        divtime[div_now] = i                  # note when
        z[diverge] = 2                        # avoid diverging too much

    return divtime


plt.imshow(mandelbrot(400, 400))
plt.show()
Out[82]:
<matplotlib.image.AxesImage at 0x209d0b5c4e0>

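The second scenario resembles the integer-sequence indexing described earlier¶

However, it is best used to select along a single dimension. Applying boolean masks to several dimensions at once gives a surprising result, because each mask is converted to integer positions and the positions are then paired element by element, as in integer-sequence indexing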

In [83]:
arr = np.arange(12).reshape(3, 4)
mask_1 = np.array([False, True, True])
mask_2 = np.array([True, False, True, False])

arr[mask_1, :]  # select rows
arr[mask_1]  # select rows
arr[:, mask_2]  # select columns
arr[mask_1, mask_2]  # surprising result: picks elements (1, 0) and (2, 2)
Out[83]:
array([[ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])
Out[83]:
array([[ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])
Out[83]:
array([[ 0,  2],
       [ 4,  6],
       [ 8, 10]])
Out[83]:
array([ 4, 10])
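If the intent is to keep every selected row crossed with every selected column, `np.ix_` (which accepts boolean sequences) is one way to express it. A minimal sketch, reusing the masks from the cell above:

```python
import numpy as np

arr = np.arange(12).reshape(3, 4)
mask_1 = np.array([False, True, True])
mask_2 = np.array([True, False, True, False])

block = arr[np.ix_(mask_1, mask_2)]   # rows 1, 2 crossed with columns 0, 2
print(block)
# [[ 4  6]
#  [ 8 10]]

# Selecting in two steps gives the same block.
print(np.array_equal(block, arr[mask_1][:, mask_2]))  # True
```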

ix_¶

Construct an open mesh from multiple sequences.

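Returns open-mesh index arrays that together cover every combination of one element taken from each input sequence

numpy.ix_(*args)

For example, to compute a+b*c for every combination of elements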

In [84]:
a = np.array([2, 3, 4, 5])
b = np.array([8, 5, 4])
c = np.array([5, 4, 6, 8, 3])
ax, bx, cx = np.ix_(a, b, c)

ax
bx
cx

ax.shape, bx.shape, cx.shape
Out[84]:
array([[[2]],

       [[3]],

       [[4]],

       [[5]]])
Out[84]:
array([[[8],
        [5],
        [4]]])
Out[84]:
array([[[5, 4, 6, 8, 3]]])
Out[84]:
((4, 1, 1), (1, 3, 1), (1, 1, 5))
In [85]:
result = ax+bx*cx
result
result.shape
Out[85]:
array([[[42, 34, 50, 66, 26],
        [27, 22, 32, 42, 17],
        [22, 18, 26, 34, 14]],

       [[43, 35, 51, 67, 27],
        [28, 23, 33, 43, 18],
        [23, 19, 27, 35, 15]],

       [[44, 36, 52, 68, 28],
        [29, 24, 34, 44, 19],
        [24, 20, 28, 36, 16]],

       [[45, 37, 53, 69, 29],
        [30, 25, 35, 45, 20],
        [25, 21, 29, 37, 17]]])
Out[85]:
(4, 3, 5)
In [86]:
result[3, 2, 4]
a[3]+b[2]*c[4]
Out[86]:
17
Out[86]:
17

Using this property to implement reduce

In [87]:
def ufunc_reduce(ufct, *vectors):
    vs = np.ix_(*vectors)
    r = ufct.identity
    for v in vs:
        r = ufct(r, v)
    return r


ufunc_reduce(np.add, a, b, c)
Out[87]:
array([[[15, 14, 16, 18, 13],
        [12, 11, 13, 15, 10],
        [11, 10, 12, 14,  9]],

       [[16, 15, 17, 19, 14],
        [13, 12, 14, 16, 11],
        [12, 11, 13, 15, 10]],

       [[17, 16, 18, 20, 15],
        [14, 13, 15, 17, 12],
        [13, 12, 14, 16, 11]],

       [[18, 17, 19, 21, 16],
        [15, 14, 16, 18, 13],
        [14, 13, 15, 17, 12]]])

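Compared with the ordinary ufunc.reduce, the advantage of this version of reduce is that it relies on the broadcasting rules and thereby avoids creating an intermediate argument array.

Indexing with strings¶

See Structured arrays

Linear algebra¶

Simple linear algebra operations¶

For more, see the numpy.linalg module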

In [88]:
arr = np.array([[1.0, 2.0], [3.0, 4.0]])
print(arr)

arr.transpose()  # transpose
np.linalg.inv(arr)  # inverse
[[1. 2.]
 [3. 4.]]
Out[88]:
array([[1., 3.],
       [2., 4.]])
Out[88]:
array([[-2. ,  1. ],
       [ 1.5, -0.5]])
In [89]:
u = np.eye(2)  # 2x2 identity matrix
u

j = np.array([[0.0, -1.0], [1.0, 0.0]])
j
Out[89]:
array([[1., 0.],
       [0., 1.]])
Out[89]:
array([[ 0., -1.],
       [ 1.,  0.]])
In [90]:
np.dot(j, j)  # matrix product
np.trace(u)  # trace
Out[90]:
array([[-1.,  0.],
       [ 0., -1.]])
Out[90]:
2.0
In [91]:
y = np.array([[5.], [7.]])
np.linalg.solve(arr, y)  # solve a linear system

np.linalg.eig(j)  # compute eigenvalues and eigenvectors
Out[91]:
array([[-3.],
       [ 4.]])
Out[91]:
(array([0.+1.j, 0.-1.j]),
 array([[0.70710678+0.j        , 0.70710678-0.j        ],
        [0.        -0.70710678j, 0.        +0.70710678j]]))
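The results above can be sanity-checked by substituting them back into the defining equations. The sketch below does so with the `@` operator (an alternative to `np.dot` for 2-D arrays) and `np.allclose`; the variable names `x`, `w` and `v` are introduced here for illustration.

```python
import numpy as np

arr = np.array([[1.0, 2.0], [3.0, 4.0]])
y = np.array([[5.0], [7.0]])
j = np.array([[0.0, -1.0], [1.0, 0.0]])

x = np.linalg.solve(arr, y)
print(np.allclose(arr @ x, y))                           # True: x solves arr x = y
print(np.allclose(arr @ np.linalg.inv(arr), np.eye(2)))  # True: inverse check

w, v = np.linalg.eig(j)          # eigenvalues w, eigenvectors in the columns of v
print(np.allclose(j @ v, v * w)) # True: each column satisfies j v_k = w_k v_k
```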

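Tricks and tips¶

"Automatic" reshaping¶

When changing the shape of an ndarray, one dimension can be left unspecified (written as -1) and NumPy infers it automatically.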

In [92]:
arr = np.arange(30)
arr.shape = 2, -1, 3

arr.shape
arr
Out[92]:
(2, 5, 3)
Out[92]:
array([[[ 0,  1,  2],
        [ 3,  4,  5],
        [ 6,  7,  8],
        [ 9, 10, 11],
        [12, 13, 14]],

       [[15, 16, 17],
        [18, 19, 20],
        [21, 22, 23],
        [24, 25, 26],
        [27, 28, 29]]])
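The same inference works with `reshape`, and it fails in two predictable ways. The following is a small sketch; the list `errors` exists only to collect the messages.

```python
import numpy as np

arr = np.arange(30)

reshaped = arr.reshape(2, -1, 3)   # -1 is inferred as 30 / (2 * 3) = 5
print(reshaped.shape)              # (2, 5, 3)

errors = []
# Two unknown dimensions, or a size that does not divide evenly, both fail.
for bad_shape in [(-1, -1, 3), (4, -1)]:
    try:
        arr.reshape(bad_shape)
    except ValueError as exc:
        errors.append(str(exc))

print(len(errors))                 # 2
```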

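Vector stacking¶

See the section on combining ndarrays

Histograms¶

The histogram function takes an ndarray as input and returns a histogram vector and a bin vector: the histogram vector holds the count for each interval, and the bin vector holds the interval edges.

matplotlib also has a histogram function, hist. The main difference from NumPy's is that hist draws the histogram automatically, whereas numpy.histogram only produces the data.

Compute the histogram of a set of data.

numpy.histogram(a, bins=10, range=None, normed=None, weights=None, density=None)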
In [93]:
# generate data
mu, sigma = 2, 0.5
arr = np.random.normal(mu, sigma, 10000)
arr
Out[93]:
array([1.47284974, 1.31379922, 2.93450001, ..., 3.30305208, 2.03302393,
       1.88693229])
In [94]:
# Plot a normalized histogram with 50 bins
plt.hist(arr, bins=50, density=True)
plt.show()
Out[94]:
(array([0.00116306, 0.00116306, 0.00232611, 0.00116306, 0.00232611,
        0.00697833, 0.00697833, 0.00465222, 0.01860889, 0.03023944,
        0.0523375 , 0.07443555, 0.10467499, 0.14654499, 0.16515387,
        0.26401359, 0.33844914, 0.42567829, 0.53965773, 0.56408189,
        0.65480022, 0.73156188, 0.80716048, 0.8013452 , 0.74319243,
        0.75947521, 0.7676166 , 0.67108299, 0.59897356, 0.56524495,
        0.42102607, 0.36519941, 0.28145942, 0.20120859, 0.15817554,
        0.1279361 , 0.07327249, 0.06164194, 0.03372861, 0.01744583,
        0.01511972, 0.00697833, 0.00581528, 0.00232611, 0.00232611,
        0.00348917, 0.00116306, 0.        , 0.        , 0.00116306]),
 array([-0.07747034,  0.00851008,  0.0944905 ,  0.18047093,  0.26645135,
         0.35243178,  0.4384122 ,  0.52439262,  0.61037305,  0.69635347,
         0.78233389,  0.86831432,  0.95429474,  1.04027516,  1.12625559,
         1.21223601,  1.29821643,  1.38419686,  1.47017728,  1.55615771,
         1.64213813,  1.72811855,  1.81409898,  1.9000794 ,  1.98605982,
         2.07204025,  2.15802067,  2.24400109,  2.32998152,  2.41596194,
         2.50194236,  2.58792279,  2.67390321,  2.75988364,  2.84586406,
         2.93184448,  3.01782491,  3.10380533,  3.18978575,  3.27576618,
         3.3617466 ,  3.44772702,  3.53370745,  3.61968787,  3.70566829,
         3.79164872,  3.87762914,  3.96360957,  4.04958999,  4.13557041,
         4.22155084]),
 <a list of 50 Patch objects>)
In [95]:
# compute with NumPy
n, bins = np.histogram(arr, bins=50, density=True)
n
bins
Out[95]:
array([0.00116306, 0.00116306, 0.00232611, 0.00116306, 0.00232611,
       0.00697833, 0.00697833, 0.00465222, 0.01860889, 0.03023944,
       0.0523375 , 0.07443555, 0.10467499, 0.14654499, 0.16515387,
       0.26401359, 0.33844914, 0.42567829, 0.53965773, 0.56408189,
       0.65480022, 0.73156188, 0.80716048, 0.8013452 , 0.74319243,
       0.75947521, 0.7676166 , 0.67108299, 0.59897356, 0.56524495,
       0.42102607, 0.36519941, 0.28145942, 0.20120859, 0.15817554,
       0.1279361 , 0.07327249, 0.06164194, 0.03372861, 0.01744583,
       0.01511972, 0.00697833, 0.00581528, 0.00232611, 0.00232611,
       0.00348917, 0.00116306, 0.        , 0.        , 0.00116306])
Out[95]:
array([-0.07747034,  0.00851008,  0.0944905 ,  0.18047093,  0.26645135,
        0.35243178,  0.4384122 ,  0.52439262,  0.61037305,  0.69635347,
        0.78233389,  0.86831432,  0.95429474,  1.04027516,  1.12625559,
        1.21223601,  1.29821643,  1.38419686,  1.47017728,  1.55615771,
        1.64213813,  1.72811855,  1.81409898,  1.9000794 ,  1.98605982,
        2.07204025,  2.15802067,  2.24400109,  2.32998152,  2.41596194,
        2.50194236,  2.58792279,  2.67390321,  2.75988364,  2.84586406,
        2.93184448,  3.01782491,  3.10380533,  3.18978575,  3.27576618,
        3.3617466 ,  3.44772702,  3.53370745,  3.61968787,  3.70566829,
        3.79164872,  3.87762914,  3.96360957,  4.04958999,  4.13557041,
        4.22155084])
In [96]:
# plot with matplotlib
plt.plot(.5*(bins[1:]+bins[:-1]), n)  # plot the values at the bin midpoints
plt.show()
Out[96]:
[<matplotlib.lines.Line2D at 0x209d0d71748>]
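Two properties of the returned vectors are worth checking: there is one more bin edge than there are bins, and with `density=True` the bar areas sum to 1. The sketch below assumes NumPy 1.17 or later for `np.random.default_rng`; the seed and the name `data` are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)          # seed chosen arbitrarily
data = rng.normal(2, 0.5, 10000)

n, bins = np.histogram(data, bins=50, density=True)

print(len(n), len(bins))                # 50 51
area = np.sum(n * np.diff(bins))        # bar height times bar width
print(np.isclose(area, 1.0))            # True
```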

index¶

The numpy.random module¶

Some commonly used random functions

rand(d0, d1, ..., dn)               Random values in a given shape. Create an array of the given shape and populate it with random samples from a uniform distribution over [0, 1)
randn(d0, d1, ..., dn)              Return a sample (or samples) from the “standard normal” distribution.
randint(low[, high, size, dtype])   Return random integers from low (inclusive) to high (exclusive).
random_integers(low[, high, size])  Random integers of type np.int between low and high, inclusive.
random_sample([size])               Return random floats in the half-open interval [0.0, 1.0).
random([size])                      Return random floats in the half-open interval [0.0, 1.0).
ranf([size])                        Return random floats in the half-open interval [0.0, 1.0).
sample([size])                      Return random floats in the half-open interval [0.0, 1.0).
choice(a[, size, replace, p])       Generates a random sample from a given 1-D array
bytes(length)                       Return random bytes.

ndarray conversion¶

ndarray.item(*args) Copy an element of an array to a standard Python scalar and return it.
ndarray.tolist()    Return the array as a (possibly nested) list.
ndarray.itemset(*args)  Insert scalar into an array (scalar is cast to array’s dtype, if possible)
ndarray.tostring([order])   Construct Python bytes containing the raw data bytes in the array.
ndarray.tobytes([order])    Construct Python bytes containing the raw data bytes in the array.
ndarray.tofile(fid[, sep, format])  Write array to a file as text or binary (default).
ndarray.dump(file)  Dump a pickle of the array to the specified file.
ndarray.dumps() Returns the pickle of the array as a string.
ndarray.astype(dtype[, order, casting, ...])    Copy of the array, cast to a specified type.
ndarray.byteswap(inplace)   Swap the bytes of the array elements
ndarray.copy([order])   Return a copy of the array.
ndarray.view([dtype, type]) New view of array with the same data.
ndarray.getfield(dtype[, offset])   Returns a field of the given array as a certain type.
ndarray.setflags([write, align, uic])   Set array flags WRITEABLE, ALIGNED, and UPDATEIFCOPY, respectively.
ndarray.fill(value) Fill the array with a scalar value.

Shape manipulation¶

ndarray.reshape(shape[, order]) Returns an array containing the same data with a new shape.
ndarray.resize(new_shape[, refcheck])   Change shape and size of array in-place.
ndarray.transpose(*axes)    Returns a view of the array with axes transposed.
ndarray.swapaxes(axis1, axis2)  Return a view of the array with axis1 and axis2 interchanged.
ndarray.flatten([order])    Return a copy of the array collapsed into one dimension.
ndarray.ravel([order])  Return a flattened array.
ndarray.squeeze([axis]) Remove single-dimensional entries from the shape of a.

Element indexing and manipulation¶

ndarray.take(indices[, axis, out, mode])    Return an array formed from the elements of a at the given indices.
ndarray.put(indices, values[, mode])    Set a.flat[n] = values[n] for all n in indices.
ndarray.repeat(repeats[, axis]) Repeat elements of an array.
ndarray.choose(choices[, out, mode])    Use an index array to construct a new array from a set of choices.
ndarray.sort([axis, kind, order])   Sort an array, in-place.
ndarray.argsort([axis, kind, order])    Returns the indices that would sort this array.
ndarray.partition(kth[, axis, kind, order]) Rearranges the elements in the array in such a way that value of the element in kth position is in the position it would be in a sorted array.
ndarray.argpartition(kth[, axis, kind, order])  Returns the indices that would partition this array.
ndarray.searchsorted(v[, side, sorter]) Find indices where elements of v should be inserted in a to maintain order.
ndarray.nonzero()   Return the indices of the elements that are non-zero.
ndarray.compress(condition[, axis, out])    Return selected slices of this array along given axis.
ndarray.diagonal([offset, axis1, axis2])    Return specified diagonals.

Calculation¶

ndarray.argmax([axis, out]) Return indices of the maximum values along the given axis.
ndarray.min([axis, out, keepdims])  Return the minimum along a given axis.
ndarray.argmin([axis, out]) Return indices of the minimum values along the given axis of a.
ndarray.ptp([axis, out])    Peak to peak (maximum - minimum) value along a given axis.
ndarray.clip([min, max, out])   Return an array whose values are limited to [min, max].
ndarray.conj()  Complex-conjugate all elements.
ndarray.round([decimals, out])  Return a with each element rounded to the given number of decimals.
ndarray.trace([offset, axis1, axis2, dtype, out])   Return the sum along diagonals of the array.
ndarray.sum([axis, dtype, out, keepdims])   Return the sum of the array elements over the given axis.
ndarray.cumsum([axis, dtype, out])  Return the cumulative sum of the elements along the given axis.
ndarray.mean([axis, dtype, out, keepdims])  Returns the average of the array elements along given axis.
ndarray.var([axis, dtype, out, ddof, keepdims]) Returns the variance of the array elements, along given axis.
ndarray.std([axis, dtype, out, ddof, keepdims]) Returns the standard deviation of the array elements along given axis.
ndarray.prod([axis, dtype, out, keepdims])  Return the product of the array elements over the given axis
ndarray.cumprod([axis, dtype, out]) Return the cumulative product of the elements along the given axis.
ndarray.all([axis, out, keepdims])  Returns True if all elements evaluate to True.
ndarray.any([axis, out, keepdims])  Returns True if any of the elements of a evaluate to True.

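Supplementary material follows¶

Understanding functions¶

numpy.apply_along_axis¶

Apply a function to 1-D slices along the given axis.

The function is applied along one dimension of the ndarray; that is, the function sees each input as a 1-D ndarray.

numpy.apply_along_axis(func1d, axis, arr, *args, **kwargs)

The specified dimension is "treated as" the only dimension, and the function operates on that 1-D ndarray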

In [97]:
def my_func(a):
    """Average the first and last values of the 1-D slice."""
    return (a[0] + a[-1]) * 0.5


arr = np.arange(60).reshape((3, 4, 5))

np.apply_along_axis(my_func, 0, arr)
np.apply_along_axis(my_func, 1, arr)
Out[97]:
array([[20., 21., 22., 23., 24.],
       [25., 26., 27., 28., 29.],
       [30., 31., 32., 33., 34.],
       [35., 36., 37., 38., 39.]])
Out[97]:
array([[ 7.5,  8.5,  9.5, 10.5, 11.5],
       [27.5, 28.5, 29.5, 30.5, 31.5],
       [47.5, 48.5, 49.5, 50.5, 51.5]])
In [98]:
arr = np.arange(60).reshape((3, 4, 5))
np.apply_along_axis(sorted, 1, arr)
Out[98]:
array([[[ 0,  1,  2,  3,  4],
        [ 5,  6,  7,  8,  9],
        [10, 11, 12, 13, 14],
        [15, 16, 17, 18, 19]],

       [[20, 21, 22, 23, 24],
        [25, 26, 27, 28, 29],
        [30, 31, 32, 33, 34],
        [35, 36, 37, 38, 39]],

       [[40, 41, 42, 43, 44],
        [45, 46, 47, 48, 49],
        [50, 51, 52, 53, 54],
        [55, 56, 57, 58, 59]]])
In [99]:
arr = np.arange(60).reshape((3, 4, 5))
res = np.apply_along_axis(np.diag, 1, arr)
res
res.shape
Out[99]:
array([[[[ 0,  1,  2,  3,  4],
         [ 0,  0,  0,  0,  0],
         [ 0,  0,  0,  0,  0],
         [ 0,  0,  0,  0,  0]],

        [[ 0,  0,  0,  0,  0],
         [ 5,  6,  7,  8,  9],
         [ 0,  0,  0,  0,  0],
         [ 0,  0,  0,  0,  0]],

        [[ 0,  0,  0,  0,  0],
         [ 0,  0,  0,  0,  0],
         [10, 11, 12, 13, 14],
         [ 0,  0,  0,  0,  0]],

        [[ 0,  0,  0,  0,  0],
         [ 0,  0,  0,  0,  0],
         [ 0,  0,  0,  0,  0],
         [15, 16, 17, 18, 19]]],


       [[[20, 21, 22, 23, 24],
         [ 0,  0,  0,  0,  0],
         [ 0,  0,  0,  0,  0],
         [ 0,  0,  0,  0,  0]],

        [[ 0,  0,  0,  0,  0],
         [25, 26, 27, 28, 29],
         [ 0,  0,  0,  0,  0],
         [ 0,  0,  0,  0,  0]],

        [[ 0,  0,  0,  0,  0],
         [ 0,  0,  0,  0,  0],
         [30, 31, 32, 33, 34],
         [ 0,  0,  0,  0,  0]],

        [[ 0,  0,  0,  0,  0],
         [ 0,  0,  0,  0,  0],
         [ 0,  0,  0,  0,  0],
         [35, 36, 37, 38, 39]]],


       [[[40, 41, 42, 43, 44],
         [ 0,  0,  0,  0,  0],
         [ 0,  0,  0,  0,  0],
         [ 0,  0,  0,  0,  0]],

        [[ 0,  0,  0,  0,  0],
         [45, 46, 47, 48, 49],
         [ 0,  0,  0,  0,  0],
         [ 0,  0,  0,  0,  0]],

        [[ 0,  0,  0,  0,  0],
         [ 0,  0,  0,  0,  0],
         [50, 51, 52, 53, 54],
         [ 0,  0,  0,  0,  0]],

        [[ 0,  0,  0,  0,  0],
         [ 0,  0,  0,  0,  0],
         [ 0,  0,  0,  0,  0],
         [55, 56, 57, 58, 59]]]])
Out[99]:
(3, 4, 4, 5)
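For a simple function such as `my_func`, the same result can usually be obtained with plain slicing, which avoids calling a Python function once per slice. The comparison below is a sketch; the names `looped` and `vectorized` are introduced here.

```python
import numpy as np

def my_func(a):
    """Average the first and last values of a 1-D slice."""
    return (a[0] + a[-1]) * 0.5

arr = np.arange(60).reshape((3, 4, 5))

looped = np.apply_along_axis(my_func, 1, arr)

# Same computation written with slicing along axis 1.
vectorized = (arr[:, 0, :] + arr[:, -1, :]) * 0.5

print(looped.shape)                         # (3, 5)
print(np.array_equal(looped, vectorized))   # True
```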

numpy.roll¶

Roll array elements along a given axis.

Cyclically shifts the elements of the ndarray along the specified axis

np.roll(a, shift, axis=None)
In [100]:
arr = np.arange(12).reshape((3, 4))

np.roll(arr, 2)
Out[100]:
array([[10, 11,  0,  1],
       [ 2,  3,  4,  5],
       [ 6,  7,  8,  9]])
In [101]:
np.roll(arr, 2, 0)
Out[101]:
array([[ 4,  5,  6,  7],
       [ 8,  9, 10, 11],
       [ 0,  1,  2,  3]])
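On a 1-D array the wrap-around is easy to see, including with a negative shift. A minimal sketch, with `x` as an illustrative array:

```python
import numpy as np

x = np.arange(6)

print(np.roll(x, 2))    # [4 5 0 1 2 3]
print(np.roll(x, -2))   # [2 3 4 5 0 1]

# Rolling by 2 is the same as moving the last 2 elements to the front.
print(np.array_equal(np.roll(x, 2), np.concatenate([x[-2:], x[:-2]])))  # True
```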

Statistical functions¶

Some statistical functions in NumPy

Basic descriptive statistics with scipy.stats

In [102]:
from scipy import stats
arr = np.arange(12).reshape((3, 4))

stats.describe(arr)
Out[102]:
DescribeResult(nobs=3, minmax=(array([0, 1, 2, 3]), array([ 8,  9, 10, 11])), mean=array([4., 5., 6., 7.]), variance=array([16., 16., 16., 16.]), skewness=array([0., 0., 0., 0.]), kurtosis=array([-1.5, -1.5, -1.5, -1.5]))
In [103]:
# sum of all elements
np.sum(arr)
Out[103]:
66
In [104]:
# sum of each column
np.sum(arr, axis=0)
Out[104]:
array([12, 15, 18, 21])
In [105]:
# sum of each row

np.sum(arr, axis=1)
Out[105]:
array([ 6, 22, 38])
In [106]:
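# cumulative sum over all elements in row-major order (left to right, then top to bottom): each step adds the current element to the running total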

np.cumsum(arr)
Out[106]:
array([ 0,  1,  3,  6, 10, 15, 21, 28, 36, 45, 55, 66], dtype=int32)
In [107]:
# cumulative sum down each column, returned as a 2-D array

np.cumsum(arr, axis=0)
Out[107]:
array([[ 0,  1,  2,  3],
       [ 4,  6,  8, 10],
       [12, 15, 18, 21]], dtype=int32)
In [108]:
# cumulative product along each row, returned as a 2-D array

np.cumprod(arr, axis=1)
Out[108]:
array([[   0,    0,    0,    0],
       [   4,   20,  120,  840],
       [   8,   72,  720, 7920]], dtype=int32)
In [109]:
# minimum of all elements

np.min(arr)
Out[109]:
0
In [110]:
# maximum of each column

np.max(arr, axis=0)
Out[110]:
array([ 8,  9, 10, 11])
In [111]:
# mean of all elements

np.mean(arr)
Out[111]:
5.5
In [112]:
# mean of each row

np.mean(arr, axis=1)
Out[112]:
array([1.5, 5.5, 9.5])
In [113]:
# median of all elements

np.median(arr)
Out[113]:
5.5
In [114]:
# median of each column

np.median(arr, axis=0)
Out[114]:
array([4., 5., 6., 7.])
In [115]:
# variance of all elements

np.var(arr)
Out[115]:
11.916666666666666
In [116]:
# standard deviation of each row

np.std(arr, axis=1)
Out[116]:
array([1.11803399, 1.11803399, 1.11803399])

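In addition:

  • unique(x): the unique elements of x, returned sorted
  • intersect1d(x,y): the elements common to x and y, i.e. the intersection
  • union1d(x,y): the union of x and y
  • setdiff1d(x,y): the set difference of x and y, i.e. elements in x that are not in y
  • setxor1d(x,y): the symmetric difference, i.e. elements that are in one array but not in both
  • in1d(x,y): tests whether each element of x is contained in y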
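The set routines listed above can be tried on two small arrays. This is a sketch with made-up inputs `x` and `y`; it uses `np.isin` for the membership test, which newer NumPy documentation recommends over `in1d` for new code.

```python
import numpy as np

x = np.array([1, 2, 2, 3, 4])
y = np.array([3, 4, 5])

print(np.unique(x))               # [1 2 3 4]
print(np.intersect1d(x, y))       # [3 4]
print(np.union1d(x, y))           # [1 2 3 4 5]
print(np.setdiff1d(x, y))         # [1 2]
print(np.setxor1d(x, y))          # [1 2 5]
print(np.isin(x, y).tolist())     # [False, False, False, True, True]
```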

Matrix library¶

The numpy.matlib library contains all the functions of the NumPy namespace; only the functions listed below are replaced with versions that work with matrix objects.

Functions in the numpy namespace that return a matrix:

  • mat(data[, dtype]) # Interpret the input as a matrix.
  • matrix # Returns a matrix from an array-like object, or from a string of data.
  • asmatrix(data[, dtype]) # Interpret the input as a matrix.
  • bmat(obj[, ldict, gdict]) # Build a matrix object from a string, nested sequence, or array.

Functions replaced in the matlib library:

  • empty(shape[, dtype, order]) # Return a new matrix of given shape and type, without initializing entries.
  • zeros(shape[, dtype, order]) # Return a matrix of given shape and type, filled with zeros.
  • ones(shape[, dtype, order]) # Matrix of ones.
  • eye(n[, M, k, dtype]) # Return a matrix with ones on the diagonal and zeros elsewhere.
  • identity(n[, dtype]) # Returns the square identity matrix of given size.
  • repmat(a, m, n) # Repeat a 0-D to 2-D array or matrix MxN times.
  • rand(*args) # Return a matrix of random values with given shape.
  • randn(*args) # Return a random matrix with data from the “standard normal” distribution.

Compare the results of stacking arrays of different shapes; as the outputs below show, (n,) behaves much like (1,n)

In [117]:
x = np.arange(10)  # (10,) shape
y = x.reshape(-1, 1)  # (10, 1) shape
z = x.reshape(1, -1)  # (1, 10) shape

np.vstack([x, x]).shape
np.hstack([x, x]).shape

np.vstack([y, y]).shape
np.hstack([y, y]).shape

np.vstack([z, z]).shape
np.hstack([z, z]).shape
Out[117]:
(2, 10)
Out[117]:
(20,)
Out[117]:
(20, 1)
Out[117]:
(10, 2)
Out[117]:
(2, 10)
Out[117]:
(1, 20)
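The vstack shapes above follow from the way `vstack` promotes 1-D inputs of shape (N,) to (1, N) before concatenating. The sketch below makes that promotion visible with `np.atleast_2d` and shows `np.column_stack` as one way to stack 1-D arrays as columns instead.

```python
import numpy as np

x = np.arange(10)                      # shape (10,)

print(np.atleast_2d(x).shape)          # (1, 10)
print(np.vstack([x, x]).shape)         # (2, 10)

# To stack 1-D arrays as columns instead, use column_stack.
print(np.column_stack([x, x]).shape)   # (10, 2)
```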

numpy.random.choice¶

Generates a random sample from a given 1-D array

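numpy.random.choice(a, size=None, replace=True, p=None)

The replace parameter controls whether an element may be selected more than once (sampling with replacement)

random.choices from the Python standard library is similar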

In [118]:
np.random.choice(5, 3, p=[0.1, 0, 0.3, 0.6, 0])
np.random.choice(5, 3, replace=False, p=[0.1, 0, 0.3, 0.6, 0])

aa_milne_arr = ['pooh', 'rabbit', 'piglet', 'Christopher']
np.random.choice(aa_milne_arr, 5, p=[0.5, 0.1, 0.1, 0.3])
Out[118]:
array([0, 2, 3], dtype=int64)
Out[118]:
array([3, 2, 0])
Out[118]:
array(['pooh', 'rabbit', 'rabbit', 'pooh', 'piglet'], dtype='<U11')
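The effect of `replace` and `p` can be checked without depending on particular random draws. This sketch assumes NumPy 1.17 or later and uses the `Generator.choice` method from `np.random.default_rng` rather than the legacy function shown above; the seed is arbitrary.

```python
import numpy as np

rng = np.random.default_rng(42)   # seed chosen arbitrarily

p = [0.1, 0, 0.3, 0.6, 0]
with_repl = rng.choice(5, 3, p=p)
without_repl = rng.choice(5, 3, replace=False, p=p)

# Only 0, 2 and 3 have non-zero probability, so drawing three values
# without replacement must return exactly those three, in some order.
print(sorted(without_repl.tolist()))                 # [0, 2, 3]
print(set(with_repl.tolist()).issubset({0, 2, 3}))   # True
```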

array_split¶

Split an array into multiple sub-arrays.

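Splits arr into several sub-arrays and returns them as a list

numpy.array_split(ary, indices_or_sections, axis=0)

It is similar to numpy.split(), but note how it handles an array that cannot be divided evenly: the sub-arrays are allowed to differ in size instead of raising an error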

In [119]:
arr = np.arange(12).reshape((3, 4))

np.array_split(arr, 3)
np.array_split(arr, 2)
Out[119]:
[array([[0, 1, 2, 3]]), array([[4, 5, 6, 7]]), array([[ 8,  9, 10, 11]])]
Out[119]:
[array([[0, 1, 2, 3],
        [4, 5, 6, 7]]), array([[ 8,  9, 10, 11]])]
In [120]:
np.array_split(arr, 2, axis=1)
np.array_split(arr, 3, axis=1)
Out[120]:
[array([[0, 1],
        [4, 5],
        [8, 9]]), array([[ 2,  3],
        [ 6,  7],
        [10, 11]])]
Out[120]:
[array([[0, 1],
        [4, 5],
        [8, 9]]), array([[ 2],
        [ 6],
        [10]]), array([[ 3],
        [ 7],
        [11]])]
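The difference from `numpy.split()` shows up as soon as the length is not a multiple of the number of sections. A minimal sketch with a 7-element array; the flag `split_failed` is introduced only to record the outcome.

```python
import numpy as np

x = np.arange(7)

parts = np.array_split(x, 3)
print([p.tolist() for p in parts])   # [[0, 1, 2], [3, 4], [5, 6]]

split_failed = False
try:
    np.split(x, 3)                   # 7 elements cannot be split into 3 equal parts
except ValueError:
    split_failed = True

print(split_failed)                  # True
```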

numpy.linalg.norm¶

Matrix or vector norm.

Returns one of several norms, selected by ord

numpy.linalg.norm(x, ord=None, axis=None, keepdims=False)
  • https://docs.scipy.org/doc/numpy/reference/generated/numpy.linalg.norm.html
  • Algebra Routine
ord    norm for matrices             norm for vectors
None   Frobenius norm                2-norm
‘fro’  Frobenius norm                –
‘nuc’  nuclear norm                  –
inf    max(sum(abs(x), axis=1))      max(abs(x))
-inf   min(sum(abs(x), axis=1))      min(abs(x))
0      –                             sum(x != 0)
1      max(sum(abs(x), axis=0))      as below
-1     min(sum(abs(x), axis=0))      as below
2      2-norm (largest sing. value)  as below
-2     smallest singular value       as below
other  –                             sum(abs(x)**ord)**(1./ord)

The Frobenius norm is given by:

$$ ||A||_F = [\sum_{i,j} abs(a_{i,j})^2]^{1/2} $$
The nuclear norm is the sum of the singular values.
In [121]:
from numpy import linalg as LA
A = np.arange(9) - 4
A

B = A.reshape((3, 3))
B
Out[121]:
array([-4, -3, -2, -1,  0,  1,  2,  3,  4])
Out[121]:
array([[-4, -3, -2],
       [-1,  0,  1],
       [ 2,  3,  4]])
In [122]:
LA.norm(A)
LA.norm(B)
LA.norm(B, 'fro')
LA.norm(A, np.inf)
LA.norm(B, np.inf)
LA.norm(A, -np.inf)
LA.norm(B, -np.inf)
Out[122]:
7.745966692414834
Out[122]:
7.745966692414834
Out[122]:
7.745966692414834
Out[122]:
4.0
Out[122]:
9.0
Out[122]:
0.0
Out[122]:
2.0
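Some of the table entries can be reproduced by hand from their definitions. The sketch below recomputes the Frobenius, infinity and nuclear norms of the matrix `B` used above; the intermediate names `fro`, `inf_norm` and `nuc` are introduced here.

```python
import numpy as np

B = (np.arange(9) - 4).reshape(3, 3)

fro = np.sqrt(np.sum(np.abs(B) ** 2))                   # Frobenius norm from its formula
print(np.isclose(fro, np.linalg.norm(B, 'fro')))        # True

inf_norm = np.max(np.sum(np.abs(B), axis=1))            # largest absolute row sum
print(inf_norm == np.linalg.norm(B, np.inf))            # True

nuc = np.sum(np.linalg.svd(B, compute_uv=False))        # sum of singular values
print(np.isclose(nuc, np.linalg.norm(B, 'nuc')))        # True
```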

bincount¶

Count number of occurrences of each value in array of non-negative ints.

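Returns how many times each integer from 0 up to the largest element of x occurs, with 0 for values that do not occur. If weights is given, each occurrence contributes its weight instead of 1.

numpy.bincount(x, weights=None, minlength=0)

The input must consist of non-negative integers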

In [123]:
np.bincount(np.arange(5))

np.bincount(np.array([0, 1, 1, 3, 2, 1, 7]))
Out[123]:
array([1, 1, 1, 1, 1], dtype=int64)
Out[123]:
array([1, 3, 1, 1, 0, 0, 0, 1], dtype=int64)
In [124]:
arr = np.array([0, 1, 1, 3, 2, 1, 7, 23])

np.bincount(arr).size
np.bincount(arr).size == np.max(arr)+1
Out[124]:
24
Out[124]:
True
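The `weights` and `minlength` parameters are not exercised in the cells above. A short sketch with made-up inputs `x` and `w`:

```python
import numpy as np

x = np.array([0, 1, 1, 3])
w = np.array([0.5, 1.0, 2.0, 4.0])

print(np.bincount(x).tolist())                # [1, 2, 0, 1]
print(np.bincount(x, weights=w).tolist())     # [0.5, 3.0, 0.0, 4.0]
print(np.bincount(x, minlength=6).tolist())   # [1, 2, 0, 1, 0, 0]
```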

unique¶

Find the unique elements of an array.

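Removes duplicate elements from the ndarray and returns the unique elements sorted (1-D)

numpy.unique(ar, return_index=False, return_inverse=False, return_counts=False, axis=None)

By default axis=None. return_index controls whether to also return the "coordinates" of the unique elements in the original ndarray; return_inverse controls whether to also return the "coordinates" of the original elements within the unique array; return_counts controls whether to also return the counts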

In [125]:
arr = np.array([[1, 2, 1], [2, 3, 4]])
np.unique(arr)
Out[125]:
array([1, 2, 3, 4])

When axis is specified, uniqueness is judged along that dimension; with axis=0 the unique rows are returned (in the 2-D case)

In [126]:
arr = np.array([[1, 0, 0], [3, 0, 0], [2, 3, 4]])
np.unique(arr, axis=0)
Out[126]:
array([[1, 0, 0],
       [2, 3, 4],
       [3, 0, 0]])
In [127]:
arr = np.array([[1, 2, 1], [2, 3, 4]])
u, indices = np.unique(arr, return_index=True)

u
indices

arr[np.unravel_index(indices, arr.shape)]
Out[127]:
array([1, 2, 3, 4])
Out[127]:
array([0, 1, 4, 5], dtype=int64)
Out[127]:
array([1, 2, 3, 4])

return_inverse controls whether to also return the complete set of indices relative to the unique elements, which can be used to reconstruct the original ndarray

In [128]:
u, indices = np.unique(arr, return_inverse=True)

u
indices

u[indices].reshape(arr.shape)
Out[128]:
array([1, 2, 3, 4])
Out[128]:
array([0, 1, 0, 1, 2, 3], dtype=int64)
Out[128]:
array([[1, 2, 1],
       [2, 3, 4]])

return_counts controls whether to also return the counts

In [129]:
u, counts = np.unique(arr, return_counts=True)

u
counts
Out[129]:
array([1, 2, 3, 4])
Out[129]:
array([2, 2, 1, 1], dtype=int64)
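One common use of the counts is to find the most frequent element. The following is a sketch on a small made-up array with a single clear winner; when several values tie, `argmax` returns the first of them.

```python
import numpy as np

arr = np.array([[1, 2, 1], [2, 1, 4]])

values, counts = np.unique(arr, return_counts=True)
most_common = values[np.argmax(counts)]

print(values.tolist(), counts.tolist())   # [1, 2, 4] [3, 2, 1]
print(most_common)                        # 1
```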

hypot¶

Given the “legs” of a right triangle, return its hypotenuse.

Computes the hypotenuse from the two legs, i.e. sqrt(x1**2 + x2**2), element-wise

numpy.hypot(x1, x2, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])
In [130]:
np.hypot(3*np.ones((3, 3)), 4*np.ones((3, 3)))
Out[130]:
array([[5., 5., 5.],
       [5., 5., 5.],
       [5., 5., 5.]])

unravel_index¶

Converts a flat index or array of flat indices into a tuple of coordinate arrays.

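Converts flattened "coordinates" into a set of counter-intuitive "coordinates" (a tuple), in which each item (an ndarray) corresponds to one dimension.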

unravel_index(indices, shape, order='C')
In [131]:
np.unravel_index([22, 41, 37], (7, 6))

np.unravel_index([1621, 1929], (6, 7, 8, 9))
Out[131]:
(array([3, 6, 6], dtype=int64), array([4, 5, 1], dtype=int64))
Out[131]:
(array([3, 3], dtype=int64),
 array([1, 5], dtype=int64),
 array([4, 6], dtype=int64),
 array([1, 3], dtype=int64))
In [132]:
arr = np.arange(2000)

np.unravel_index(arr, (1000, 2))
Out[132]:
(array([  0,   0,   1, ..., 998, 999, 999], dtype=int64),
 array([0, 1, 0, ..., 1, 0, 1], dtype=int64))
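For the default C order, the conversion is ordinary row-major arithmetic, and `np.ravel_multi_index` performs the inverse. A minimal sketch using the first example above; the names `flat` and `coords` are introduced here.

```python
import numpy as np

shape = (7, 6)
flat = np.array([22, 41, 37])

coords = np.unravel_index(flat, shape)        # tuple: (row indices, column indices)
print([c.tolist() for c in coords])           # [[3, 6, 6], [4, 5, 1]]

# With C order, flat index = row * n_columns + column.
print((coords[0] * shape[1] + coords[1]).tolist())      # [22, 41, 37]
print(np.ravel_multi_index(coords, shape).tolist())     # [22, 41, 37]
```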

NumPy data types at a glance¶

Type        Code  Bytes  Description
bool        b     1      Boolean (True or False) stored as a byte
int         l     4-8    Platform (long) integer (normally either int32 or int64)
intp        p     4-8    Integer used for indexing (normally either int32 or int64)
int8        i1    1      Byte (-128 to 127)
int16       i2    2      Integer (-32768 to 32767)
int32       i4    4      Integer (-2147483648 to 2147483647)
int64       i8    8      Integer (-9223372036854775808 to 9223372036854775807)
uint8       u1    1      Unsigned integer (0 to 255)
uint16      u2    2      Unsigned integer (0 to 65535)
uint32      u4    4      Unsigned integer (0 to 4294967295)
uint64      u8    8      Unsigned integer (0 to 18446744073709551615)
float       f8    8      Shorthand for float64
float16     f2    2      Half precision float: sign bit, 5 bits exponent, 10 bits mantissa
float32     f     4      Single precision float: sign bit, 8 bits exponent, 23 bits mantissa
float64     d     8      Double precision float: sign bit, 11 bits exponent, 52 bits mantissa
complex     c16   16     Shorthand for complex128.
complex64   c8    8      Complex number, represented by two 32-bit floats
complex128  c16   16     Complex number, represented by two 64-bit floats

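Specific tasks¶

Importing and exporting ndarrays¶

NumPy offers several import and export methods for different situations. The official recommendation is to use the save and load functions with npy files: the npy format needs no extra settings, and what is loaded is exactly what was saved

Export¶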

savetxt¶

Save an array to a text file.

numpy.savetxt(fname, X, fmt='%.18e', delimiter=' ', newline='\n', header='', footer='', comments='# ', encoding=None)

By default savetxt writes values in scientific notation (fmt='%.18e')

When saving with savetxt, it is best to specify the encoding explicitly; header and comments can be set as well

In [133]:
arr = np.arange(12).reshape((3, 4))

# Export as CSV
np.savetxt('data/output.csv', arr, delimiter=',', header='',
           comments='', encoding='utf-8')  # values are written as floats

# Export as TXT
np.savetxt('data/output.txt', arr, delimiter=' ',
           header='', comments='', encoding='utf-8')
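As a rough illustration of the default format and of the fmt, header and comments parameters, the sketch below writes to in-memory buffers instead of files. The column names `a,b,c,d` are made up for the example.

```python
import io

import numpy as np

arr = np.arange(12).reshape((3, 4))

# Default format: every value is written as '%.18e'.
default_buf = io.StringIO()
np.savetxt(default_buf, arr, delimiter=',')
default_text = default_buf.getvalue()

# Integer format plus a header line; comments='' drops the leading '# '.
int_buf = io.StringIO()
np.savetxt(int_buf, arr, fmt='%d', delimiter=',',
           header='a,b,c,d', comments='')  # hypothetical column names
int_lines = int_buf.getvalue().splitlines()
```

Here `int_lines[0]` holds the header and the following lines hold plain integers, while `default_text` contains the same values in scientific notation.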

tofile¶

Write array to a file as text or binary (default).

ndarray.tofile(fid, sep="", format="%s")

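The default is binary. The raw binary file stores no dtype, shape or byte-order information, so the matching dtype has to be supplied again when reading the file back with fromfile.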

In [134]:
import os
arr = np.zeros(
    (2,), dtype=[('time', [('min', int), ('sec', int)]), ('temp', float)])

arr[0]['time']['min'] = 10
arr['temp'] = 98.25
arr

arr.tofile('data/temp.b')
Out[134]:
array([((10, 0), 98.25), (( 0, 0), 98.25)],
      dtype=[('time', [('min', '<i4'), ('sec', '<i4')]), ('temp', '<f8')])

np.save¶

Save an array to a binary file in NumPy .npy format.

numpy.save(file, arr, allow_pickle=True, fix_imports=True)
In [135]:
np.save("data/output.npy", arr)

Import¶

loadtxt¶

Load data from a text file.

numpy.loadtxt(fname, dtype=<class 'float'>, comments='#', delimiter=None, converters=None, skiprows=0, usecols=None, unpack=False, ndmin=0, encoding='bytes')

Each row must contain the same number of values.

In [136]:
np.loadtxt('data/output.csv', delimiter=',')
np.loadtxt('data/output.txt', delimiter=' ')
Out[136]:
array([[ 0.,  1.,  2.,  3.],
       [ 4.,  5.,  6.,  7.],
       [ 8.,  9., 10., 11.]])
Out[136]:
array([[ 0.,  1.,  2.,  3.],
       [ 4.,  5.,  6.,  7.],
       [ 8.,  9., 10., 11.]])

genfromtxt¶

Load data from a text file, with missing values handled as specified.

numpy.genfromtxt(fname, dtype=<class 'float'>, comments='#', delimiter=None, skip_header=0, skip_footer=0, converters=None, missing_values=None, filling_values=None, usecols=None, names=None, excludelist=None, deletechars=None, replace_space='_', autostrip=False, case_sensitive=True, defaultfmt='f%i', unpack=None, usemask=False, loose=True, invalid_raise=True, max_rows=None, encoding='bytes')
In [137]:
np.genfromtxt('data/output.csv', delimiter=',')
np.genfromtxt('data/output.txt', delimiter=' ')
Out[137]:
array([[ 0.,  1.,  2.,  3.],
       [ 4.,  5.,  6.,  7.],
       [ 8.,  9., 10., 11.]])
Out[137]:
array([[ 0.,  1.,  2.,  3.],
       [ 4.,  5.,  6.,  7.],
       [ 8.,  9., 10., 11.]])
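The files read above contain no gaps, so the missing-value handling is not visible there. A minimal sketch with a deliberately incomplete in-memory CSV (the text itself is invented for illustration) might look like this:

```python
import io

import numpy as np

# The second row has an empty middle field.
text = "1,2,3\n4,,6\n7,8,9\n"

# By default, a missing float field becomes nan.
with_nan = np.genfromtxt(io.StringIO(text), delimiter=',')

# filling_values replaces missing fields with a chosen value instead.
with_zero = np.genfromtxt(io.StringIO(text), delimiter=',', filling_values=0)
```

Both results keep the full (3, 3) shape; only the value used for the gap differs.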

fromfile¶

Construct an array from data in a text or binary file.

numpy.fromfile(file, dtype=float, count=-1, sep='')

In [138]:
np.fromfile("data/temp.b",
            dtype=[('time', [('min', int), ('sec', int)]), ('temp', float)])
Out[138]:
array([((10, 0), 98.25), (( 0, 0), 98.25)],
      dtype=[('time', [('min', '<i4'), ('sec', '<i4')]), ('temp', '<f8')])
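Because a raw binary file carries no metadata, fromfile relies entirely on the dtype it is given and always returns a flat array. The sketch below shows this with a temporary file; the file name `raw.bin` is an assumption for the example.

```python
import os
import tempfile

import numpy as np

arr = np.arange(12, dtype=np.int32).reshape((3, 4))

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'raw.bin')  # hypothetical file name
    arr.tofile(path)

    # Correct dtype: values come back, but as a flat 1-D array.
    flat = np.fromfile(path, dtype=np.int32)

    # Wrong dtype: the same 48 bytes are silently reinterpreted.
    misread = np.fromfile(path, dtype=np.float64)

# The original shape has to be restored by hand.
restored = flat.reshape(arr.shape)
```

This is the main reason np.save and np.load are usually the more convenient pair for storing arrays.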

np.load¶

Load arrays or pickled objects from .npy, .npz or pickled files.

numpy.load(file, mmap_mode=None, allow_pickle=True, fix_imports=True, encoding='ASCII')
In [139]:
np.load("data/output.npy")
Out[139]:
array([((10, 0), 98.25), (( 0, 0), 98.25)],
      dtype=[('time', [('min', '<i4'), ('sec', '<i4')]), ('temp', '<f8')])

Swapping ndarray axes (generalized transpose)¶

Permute the dimensions of an array. A view is returned whenever possible.

numpy.transpose(a, axes=None)
In [140]:
arr = np.arange(24).reshape((2, 3, 4))
arr

np.transpose(arr)  # reverse the order of all axes
np.transpose(arr).shape
Out[140]:
array([[[ 0,  1,  2,  3],
        [ 4,  5,  6,  7],
        [ 8,  9, 10, 11]],

       [[12, 13, 14, 15],
        [16, 17, 18, 19],
        [20, 21, 22, 23]]])
Out[140]:
array([[[ 0, 12],
        [ 4, 16],
        [ 8, 20]],

       [[ 1, 13],
        [ 5, 17],
        [ 9, 21]],

       [[ 2, 14],
        [ 6, 18],
        [10, 22]],

       [[ 3, 15],
        [ 7, 19],
        [11, 23]]])
Out[140]:
(4, 3, 2)
In [141]:
np.transpose(arr, (1, 0, 2))
np.transpose(arr, (1, 0, 2)).shape
Out[141]:
array([[[ 0,  1,  2,  3],
        [12, 13, 14, 15]],

       [[ 4,  5,  6,  7],
        [16, 17, 18, 19]],

       [[ 8,  9, 10, 11],
        [20, 21, 22, 23]]])
Out[141]:
(3, 2, 4)
In [142]:
np.transpose(arr, (0, 2, 1))
np.transpose(arr, (0, 2, 1)).shape
Out[142]:
array([[[ 0,  4,  8],
        [ 1,  5,  9],
        [ 2,  6, 10],
        [ 3,  7, 11]],

       [[12, 16, 20],
        [13, 17, 21],
        [14, 18, 22],
        [15, 19, 23]]])
Out[142]:
(2, 4, 3)
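The shapes above follow one rule: axis i of the result is axis axes[i] of the input. The sketch below checks that rule and also that the result shares memory with the original array, using the same (2, 3, 4) array as the cells above.

```python
import numpy as np

arr = np.arange(24).reshape((2, 3, 4))
axes = (1, 0, 2)
swapped = np.transpose(arr, axes)

# Axis i of the result is axis axes[i] of the input.
expected_shape = tuple(arr.shape[ax] for ax in axes)

# For this permutation, swapped[j, i, k] is arr[i, j, k].
same_element = swapped[2, 1, 3] == arr[1, 2, 3]

# The result is a view: writing through it also changes arr.
shares = np.shares_memory(arr, swapped)
swapped[0, 0, 0] = -1
```

After the last line, `arr[0, 0, 0]` is -1 as well, so call `.copy()` on the result when an independent array is needed.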

Vectorized programming with NumPy¶

Reference:

  • NumPy Functional programming

NumPy's functional programming routines mainly include the following:

  • apply_along_axis(func1d, axis, arr, *args, ...) Apply a function to 1-D slices along the given axis.
  • apply_over_axes(func, a, axes) Apply a function repeatedly over multiple axes.
  • vectorize(pyfunc[, otypes, doc, excluded, ...]) Generalized function class.
  • frompyfunc(func, nin, nout) Takes an arbitrary Python function and returns a NumPy ufunc.
  • piecewise(x, condlist, funclist, *args, **kw) Evaluate a piecewise-defined function.
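To make three of the routines listed above a little more concrete, here is a small sketch. The lambdas are throwaway helpers chosen for illustration only.

```python
import numpy as np

arr = np.arange(6).reshape((2, 3))

# apply_along_axis: call a 1-D function on every row (axis=1).
row_range = np.apply_along_axis(lambda row: row.max() - row.min(), 1, arr)

# frompyfunc: wrap a Python function as a ufunc (2 inputs, 1 output).
# The result always has dtype=object.
add_ufunc = np.frompyfunc(lambda a, b: a + b, 2, 1)
summed = add_ufunc(arr, 10)

# piecewise: pick a function per element according to the conditions.
x = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])
abs_x = np.piecewise(x, [x < 0, x >= 0], [lambda v: -v, lambda v: v])
```

These helpers call back into Python for each slice or element, so they are mainly a convenience; built-in ufuncs are usually faster when one fits the task.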

numpy.vectorize¶

class numpy.vectorize(pyfunc, otypes=None, doc=None, excluded=None, cache=False, signature=None)

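Define a vectorized function which takes a nested sequence of objects or numpy arrays as inputs and returns a single numpy array or a tuple of numpy arrays as output. The vectorized function evaluates pyfunc over successive tuples of the input arrays like the Python map function, except that it uses the broadcasting rules of numpy.

In other words, it returns a vectorized function built from the given Python function. It is provided mainly for convenience rather than performance; the implementation is essentially a for loop.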

In [143]:
def myfunc(a, b):
    "Return a-b if a>b, otherwise return a+b"
    if a > b:
        return a - b
    else:
        return a + b


vfunc = np.vectorize(myfunc)
vfunc([1, 2, 3, 4], 2)
Out[143]:
array([3, 4, 1, 2])
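For a simple branch like the one in myfunc, the same result can usually be obtained without np.vectorize by evaluating both branches on whole arrays and selecting with np.where. This is an alternative technique, not what the cell above does.

```python
import numpy as np

a = np.array([1, 2, 3, 4])
b = 2

# Both branches are computed for every element; np.where picks per element.
result = np.where(a > b, a - b, a + b)
```

Note that np.where evaluates both expressions everywhere, so this only works when neither branch fails or is expensive on elements that do not select it.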

Adding new dimensions to an ndarray¶

Using newaxis or None¶

In [144]:
from numpy import newaxis

arr = np.arange(12).reshape((3, 4))
arr.shape

arr[:, newaxis].shape
arr[:, None].shape

arr[newaxis, :].shape
arr[None, :].shape
Out[144]:
(3, 4)
Out[144]:
(3, 1, 4)
Out[144]:
(3, 1, 4)
Out[144]:
(1, 3, 4)
Out[144]:
(1, 3, 4)
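The two spellings give identical shapes because np.newaxis is simply an alias for None. The sketch below checks that, shows np.expand_dims as an equivalent function call, and adds one common use of the extra axis, broadcasting a vector against itself (the vector `x` is an invented example).

```python
import numpy as np

arr = np.arange(12).reshape((3, 4))

alias_is_none = np.newaxis is None

by_index = arr[:, np.newaxis]          # shape (3, 1, 4)
by_func = np.expand_dims(arr, axis=1)  # same result via a function call

# A length-1 axis lets broadcasting form all pairwise differences.
x = np.array([1, 2, 3])
pairwise = x[:, np.newaxis] - x[np.newaxis, :]
```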

Using reshape¶

In [145]:
arr = np.arange(12).reshape((3, 4))
arr.shape

arr.reshape((-1, 2, 3)).shape
arr.reshape((2, 3, -1)).shape
Out[145]:
(3, 4)
Out[145]:
(2, 2, 3)
Out[145]:
(2, 3, 2)
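The cell above redistributes the 12 elements over three axes. To add a length-1 axis in the way newaxis does, the target shape can simply contain a 1; the following sketch assumes the same (3, 4) array.

```python
import numpy as np

arr = np.arange(12).reshape((3, 4))

# Insert a length-1 axis in the middle without moving any data.
with_unit_axis = arr.reshape((3, 1, 4))

# -1 lets NumPy infer the one remaining dimension (here 3).
inferred = arr.reshape((-1, 1, 4))

matches_newaxis = np.array_equal(with_unit_axis, arr[:, np.newaxis])
```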

Custom dtype¶

Define a custom dtype when creating an array.

In [146]:
custom_ndarray = np.zeros(5, dtype=[('position', float, 2),
                                    ('size', float),    # scalar field; (float, 1) is deprecated
                                    ('growth', float),
                                    ('color', float, 4),
                                    ('name', 'U1')])    # one-character unicode string
custom_ndarray
custom_ndarray[0]
custom_ndarray[0]['position']
Out[146]:
array([([0., 0.], 0., 0., [0., 0., 0., 0.], ''),
       ([0., 0.], 0., 0., [0., 0., 0., 0.], ''),
       ([0., 0.], 0., 0., [0., 0., 0., 0.], ''),
       ([0., 0.], 0., 0., [0., 0., 0., 0.], ''),
       ([0., 0.], 0., 0., [0., 0., 0., 0.], '')],
      dtype=[('position', '<f8', (2,)), ('size', '<f8'), ('growth', '<f8'), ('color', '<f8', (4,)), ('name', '<U1')])
Out[146]:
([0., 0.], 0., 0., [0., 0., 0., 0.], '')
Out[146]:
array([0., 0.])
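Fields of a structured array can also be read and written column-wise by name. The sketch below uses a reduced, hypothetical dtype (`particle_dtype`) rather than the full one above, just to keep the example short.

```python
import numpy as np

# Hypothetical, trimmed-down version of the dtype above.
particle_dtype = np.dtype([('position', float, (2,)),
                           ('size', float),
                           ('name', 'U8')])
particles = np.zeros(3, dtype=particle_dtype)

# Selecting a field by name returns a view over all records.
particles['size'] = [1.0, 2.0, 3.0]
particles['position'][:, 0] = 10.0
particles['name'][1] = 'second'

# Fields can drive boolean masks like ordinary arrays.
big = particles[particles['size'] > 1.5]
```

Here `particles['position']` has shape (3, 2), one row per record, and `big` keeps the two records whose size exceeds 1.5.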

Finding the most frequent element in an ndarray¶

Reference: https://stackoverflow.com/questions/12297016/how-to-find-most-frequent-values-in-numpy-ndarray

1-D¶

In [147]:
arr = np.array([5, 4, -2, 1, -2, 0, 4, 4, -6, -1])
u, indices = np.unique(arr, return_inverse=True)
u
indices

count = np.bincount(indices)
count

u[np.argmax(count)]
Out[147]:
array([-6, -2, -1,  0,  1,  4,  5])
Out[147]:
array([6, 5, 1, 4, 1, 3, 5, 5, 0, 2], dtype=int64)
Out[147]:
array([1, 2, 1, 1, 1, 3, 1], dtype=int64)
Out[147]:
4
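For the 1-D case, np.unique can also return the counts directly, which avoids the separate bincount step. This is an alternative to the cell above, shown on the same data.

```python
import numpy as np

arr = np.array([5, 4, -2, 1, -2, 0, 4, 4, -6, -1])

# return_counts=True yields how often each sorted unique value occurs.
values, counts = np.unique(arr, return_counts=True)
most_frequent = values[np.argmax(counts)]
```

When several values tie for the highest count, argmax picks the first one, which is the smallest value because `values` is sorted.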

N-D¶

Find the most frequent element along a given dimension.

In [148]:
arr = np.array([[5, 5, 5, 5, -2, 0, 4, 4, -6, -1],
                [0, 1,  1, 2,  3, 4, 5, 6,  7,  8]])

u, indices = np.unique(arr, return_inverse=True)
u
indices


# bincount's minlength must be given here so that every row yields a count vector of the same length
counted = np.apply_along_axis(np.bincount, 1, indices.reshape(arr.shape),
                              None, np.max(indices) + 1)
counted

u[np.argmax(counted, axis=1)]
Out[148]:
array([-6, -2, -1,  0,  1,  2,  3,  4,  5,  6,  7,  8])
Out[148]:
array([ 8,  8,  8,  8,  1,  3,  7,  7,  0,  2,  3,  4,  4,  5,  6,  7,  8,
        9, 10, 11], dtype=int64)
Out[148]:
array([[1, 1, 1, 1, 0, 0, 0, 2, 4, 0, 0, 0],
       [0, 0, 0, 1, 2, 1, 1, 1, 1, 1, 1, 1]], dtype=int64)
Out[148]:
array([5, 1])
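The role of minlength may be easier to see when it is passed by keyword. The sketch below wraps the same idea in a helper; `row_modes` is a hypothetical name, and an explicit loop over rows replaces apply_along_axis.

```python
import numpy as np


def row_modes(arr):
    """Return the most frequent value of each row of a 2-D array."""
    values, inverse = np.unique(arr, return_inverse=True)
    inverse = inverse.reshape(arr.shape)
    n_values = len(values)
    # Without minlength, rows whose largest index differs would give
    # count vectors of different lengths that cannot be stacked.
    counts = np.stack([np.bincount(row, minlength=n_values)
                       for row in inverse])
    return values[counts.argmax(axis=1)]


arr = np.array([[5, 5, 5, 5, -2, 0, 4, 4, -6, -1],
                [0, 1, 1, 2, 3, 4, 5, 6, 7, 8]])
modes = row_modes(arr)
```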

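Converting an ndarray to a DataFrame (N-D -> 2-D)¶

Multidimensional ndarrays are an efficient data structure, but they become awkward when pandas is needed for more complex data processing, because the data pandas works with is essentially 2-D (although MultiIndex can be used). In that case the extra dimensions of the ndarray have to be collapsed into a 2-D DataFrame.

A conservative conversion¶

Suppose we have data of shape (50, 100, 3): the first dimension corresponds to time t, the second to the individual's ID, and the third to the individual's coordinates x, y, z.

To process it with pandas, the first approach is to convert the ndarray into a 2-D DataFrame of shape (5000, 5), where 5000 corresponds to 50 x 100 and two columns, t and ID, are added to x, y, z, so the column labels are t, ID, x, y, z.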
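The target layout can be sketched in NumPy alone before bringing in pandas. The random array below is only a stand-in for the real data file, and the variable names are assumptions made for this illustration.

```python
import numpy as np

# Synthetic stand-in with the same shape as the data described above.
rng = np.random.default_rng(0)
data = rng.normal(size=(50, 100, 3))
n_t, n_id, n_coord = data.shape

# In C order the last axis varies fastest, so after flattening the first
# two axes each t value is repeated n_id times and the IDs cycle.
t = np.repeat(np.arange(n_t), n_id)
ids = np.tile(np.arange(n_id), n_t)

# Columns: t, ID, x, y, z
flat = np.column_stack([t, ids, data.reshape(-1, n_coord)])
```

Row 100 of `flat` is then the record for t = 1, ID = 0, which is a quick way to confirm that the two index columns line up with the flattened coordinates.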

In [149]:
data = np.load("data/sample.npy")
data.shape
data
Out[149]:
(50, 100, 3)
Out[149]:
array([[[ 2.42442956e+02,  7.76911920e+01,  6.64777151e-01],
        [ 2.61380074e+02,  2.01793185e+02,  2.94516922e+00],
        [ 4.12767690e+02,  1.35482822e+02, -4.92483385e-01],
        ...,
        [ 4.10753164e+02,  2.02361917e+02, -1.47121999e-01],
        [ 2.69633830e+02,  2.68148789e+02,  1.27458590e+00],
        [ 3.30322105e+02,  1.75890005e+02, -7.97956043e-01]],

       [[ 2.45365704e+02,  7.83676099e+01,  2.27428156e-01],
        [ 2.58458717e+02,  2.02475590e+02,  2.91211536e+00],
        [ 4.14913593e+02,  1.33386372e+02, -7.73741839e-01],
        ...,
        [ 4.11856535e+02,  1.99572191e+02, -1.19416454e+00],
        [ 2.70467390e+02,  2.71030660e+02,  1.28923763e+00],
        [ 3.32290845e+02,  1.73626366e+02, -8.54962584e-01]],

       [[ 2.48239647e+02,  7.75071160e+01, -2.90917521e-01],
        [ 2.55461149e+02,  2.02596349e+02, -3.18185644e+00],
        [ 4.17462899e+02,  1.31804904e+02, -5.55250026e-01],
        ...,
        [ 4.09707081e+02,  1.97479383e+02, -2.36954656e+00],
        [ 2.68405895e+02,  2.73210163e+02,  2.32837616e+00],
        [ 3.33943130e+02,  1.71122378e+02, -9.87520127e-01]],

       ...,

       [[ 3.56975004e+02,  5.97239259e+00, -5.75102134e-01],
        [ 1.30143542e+02,  1.64376675e+02, -2.87853593e+00],
        [ 4.74988523e+02,  1.15517297e+01, -8.53119091e-01],
        ...,
        [ 4.27604192e+02,  8.84718553e+01, -7.13370919e-01],
        [ 1.53469992e+02,  2.17281949e+02, -3.01382082e+00],
        [ 3.95602341e+02,  5.33207351e+01, -9.52802521e-01]],

       [[ 3.58316991e+02,  3.28928418e+00, -1.10701967e+00],
        [ 1.27151855e+02,  1.64599848e+02, -3.21605251e+00],
        [ 4.75983562e+02,  8.72155337e+00, -1.23271295e+00],
        ...,
        [ 4.29080172e+02,  8.58600584e+01, -1.05641846e+00],
        [ 1.50495321e+02,  2.16892942e+02, -3.01155720e+00],
        [ 3.97560036e+02,  5.10475365e+01, -8.59831857e-01]],

       [[ 3.60269963e+02,  1.01202655e+00, -8.61907762e-01],
        [ 1.24279619e+02,  1.63733672e+02, -2.84869734e+00],
        [ 4.77039363e+02,  5.91347873e+00, -1.21116002e+00],
        ...,
        [ 4.31150993e+02,  8.36894137e+01, -8.08928892e-01],
        [ 1.47569414e+02,  2.16230318e+02, -2.91888170e+00],
        [ 3.99619625e+02,  4.88662316e+01, -8.14090703e-01]]])
In [150]:
import pandas as pd
dim_1, dim_2, dim_3 = data.shape

# Build the values used to fill the newly added t and ID columns
indice = pd.MultiIndex.from_product(
    [np.arange(dim_1), np.arange(dim_2)], names=['t', 'ID'])
indice
Out[150]:
MultiIndex(levels=[[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49], [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99]],
           labels=[[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 
6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 
12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 
17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 21, 21, 21, 21, 21, 21, 21, 21, 21, 21, 21, 21, 21, 21, 21, 21, 21, 21, 21, 21, 21, 21, 21, 21, 21, 21, 21, 21, 21, 21, 21, 21, 21, 21, 21, 21, 21, 21, 21, 21, 21, 21, 21, 21, 21, 21, 21, 21, 21, 21, 21, 21, 21, 21, 21, 21, 21, 21, 21, 21, 21, 21, 21, 21, 21, 21, 21, 21, 21, 21, 21, 21, 21, 21, 21, 21, 21, 21, 21, 21, 21, 21, 21, 21, 21, 21, 21, 21, 21, 21, 21, 21, 21, 21, 21, 21, 21, 21, 21, 21, 22, 22, 22, 22, 22, 22, 22, 22, 22, 22, 22, 22, 22, 22, 22, 22, 22, 22, 22, 22, 22, 22, 22, 22, 22, 22, 22, 22, 22, 22, 22, 22, 22, 22, 22, 22, 22, 22, 22, 22, 22, 22, 22, 22, 22, 
22, 22, 22, 22, 22, 22, 22, 22, 22, 22, 22, 22, 22, 22, 22, 22, 22, 22, 22, 22, 22, 22, 22, 22, 22, 22, 22, 22, 22, 22, 22, 22, 22, 22, 22, 22, 22, 22, 22, 22, 22, 22, 22, 22, 22, 22, 22, 22, 22, 22, 22, 22, 22, 22, 22, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 24, 24, 24, 24, 24, 24, 24, 24, 24, 24, 24, 24, 24, 24, 24, 24, 24, 24, 24, 24, 24, 24, 24, 24, 24, 24, 24, 24, 24, 24, 24, 24, 24, 24, 24, 24, 24, 24, 24, 24, 24, 24, 24, 24, 24, 24, 24, 24, 24, 24, 24, 24, 24, 24, 24, 24, 24, 24, 24, 24, 24, 24, 24, 24, 24, 24, 24, 24, 24, 24, 24, 24, 24, 24, 24, 24, 24, 24, 24, 24, 24, 24, 24, 24, 24, 24, 24, 24, 24, 24, 24, 24, 24, 24, 24, 24, 24, 24, 24, 24, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 
27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 28, 28, 28, 28, 28, 28, 28, 28, 28, 28, 28, 28, 28, 28, 28, 28, 28, 28, 28, 28, 28, 28, 28, 28, 28, 28, 28, 28, 28, 28, 28, 28, 28, 28, 28, 28, 28, 28, 28, 28, 28, 28, 28, 28, 28, 28, 28, 28, 28, 28, 28, 28, 28, 28, 28, 28, 28, 28, 28, 28, 28, 28, 28, 28, 28, 28, 28, 28, 28, 28, 28, 28, 28, 28, 28, 28, 28, 28, 28, 28, 28, 28, 28, 28, 28, 28, 28, 28, 28, 28, 28, 28, 28, 28, 28, 28, 28, 28, 28, 28, 29, 29, 29, 29, 29, 29, 29, 29, 29, 29, 29, 29, 29, 29, 29, 29, 29, 29, 29, 29, 29, 29, 29, 29, 29, 29, 29, 29, 29, 29, 29, 29, 29, 29, 29, 29, 29, 29, 29, 29, 29, 29, 29, 29, 29, 29, 29, 29, 29, 29, 29, 29, 29, 29, 29, 29, 29, 29, 29, 29, 29, 29, 29, 29, 29, 29, 29, 29, 29, 29, 29, 29, 29, 29, 29, 29, 29, 29, 29, 29, 29, 29, 29, 29, 29, 29, 29, 29, 29, 29, 29, 29, 29, 29, 29, 29, 29, 29, 29, 29, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 
32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 34, 34, 34, 34, 34, 34, 34, 34, 34, 34, 34, 34, 34, 34, 34, 34, 34, 34, 34, 34, 34, 34, 34, 34, 34, 34, 34, 34, 34, 34, 34, 34, 34, 34, 34, 34, 34, 34, 34, 34, 34, 34, 34, 34, 34, 34, 34, 34, 34, 34, 34, 34, 34, 34, 34, 34, 34, 34, 34, 34, 34, 34, 34, 34, 34, 34, 34, 34, 34, 34, 34, 34, 34, 34, 34, 34, 34, 34, 34, 34, 34, 34, 34, 34, 34, 34, 34, 34, 34, 34, 34, 34, 34, 34, 34, 34, 34, 34, 34, 34, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 
37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 
42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 43, 43, 43, 43, 43, 43, 43, 43, 43, 43, 43, 43, 43, 43, 43, 43, 43, 43, 43, 43, 43, 43, 43, 43, 43, 43, 43, 43, 43, 43, 43, 43, 43, 43, 43, 43, 43, 43, 43, 43, 43, 43, 43, 43, 43, 43, 43, 43, 43, 43, 43, 43, 43, 43, 43, 43, 43, 43, 43, 43, 43, 43, 43, 43, 43, 43, 43, 43, 43, 43, 43, 43, 43, 43, 43, 43, 43, 43, 43, 43, 43, 43, 43, 43, 43, 43, 43, 43, 43, 43, 43, 43, 43, 43, 43, 43, 43, 43, 43, 43, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 45, 45, 45, 45, 45, 45, 45, 45, 45, 45, 45, 45, 45, 45, 45, 45, 45, 45, 45, 45, 45, 45, 45, 45, 45, 45, 45, 45, 45, 45, 45, 45, 45, 45, 45, 45, 45, 45, 45, 45, 45, 45, 45, 45, 45, 45, 45, 45, 45, 45, 45, 45, 45, 45, 45, 45, 45, 45, 45, 45, 45, 45, 45, 45, 45, 45, 45, 45, 45, 45, 45, 45, 45, 45, 45, 45, 45, 45, 45, 45, 45, 45, 45, 45, 45, 45, 45, 45, 45, 45, 45, 45, 45, 45, 45, 45, 45, 45, 45, 45, 46, 46, 46, 46, 46, 46, 46, 46, 46, 46, 46, 46, 46, 46, 46, 46, 46, 46, 46, 46, 46, 46, 46, 46, 46, 46, 46, 46, 46, 46, 46, 46, 46, 46, 46, 46, 46, 46, 46, 46, 46, 46, 46, 46, 46, 46, 46, 46, 46, 46, 46, 46, 46, 46, 46, 46, 46, 46, 46, 46, 46, 46, 46, 46, 46, 46, 46, 46, 46, 46, 46, 46, 46, 46, 46, 46, 46, 46, 46, 46, 46, 46, 46, 46, 46, 46, 46, 46, 46, 46, 46, 46, 46, 46, 46, 46, 46, 46, 46, 46, 47, 47, 47, 47, 47, 47, 47, 47, 47, 47, 47, 47, 47, 47, 47, 47, 47, 47, 47, 47, 47, 47, 47, 47, 47, 47, 47, 47, 47, 47, 47, 47, 47, 47, 47, 47, 47, 47, 47, 47, 47, 47, 47, 47, 47, 
47, 47, 47, 47, 47, 47, 47, 47, 47, 47, 47, 47, 47, 47, 47, 47, 47, 47, 47, 47, 47, 47, 47, 47, 47, 47, 47, 47, 47, 47, 47, 47, 47, 47, 47, 47, 47, 47, 47, 47, 47, 47, 47, 47, 47, 47, 47, 47, 47, 47, 47, 47, 47, 47, 47, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 49, 49, 49, 49, 49, 49, 49, 49, 49, 49, 49, 49, 49, 49, 49, 49, 49, 49, 49, 49, 49, 49, 49, 49, 49, 49, 49, 49, 49, 49, 49, 49, 49, 49, 49, 49, 49, 49, 49, 49, 49, 49, 49, 49, 49, 49, 49, 49, 49, 49, 49, 49, 49, 49, 49, 49, 49, 49, 49, 49, 49, 49, 49, 49, 49, 49, 49, 49, 49, 49, 49, 49, 49, 49, 49, 49, 49, 49, 49, 49, 49, 49, 49, 49, 49, 49, 49, 49, 49, 49, 49, 49, 49, 49, 49, 49, 49, 49, 49, 49], [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 
52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 
64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 
76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 
88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 
0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 
15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 
27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 
39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 
51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 
63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99]],
           names=['t', 'ID'])
In [151]:
data = data.reshape((-1, dim_3))
data.shape
data
Out[151]:
(5000, 3)
Out[151]:
array([[242.44295567,  77.69119197,   0.66477715],
       [261.38007362, 201.79318451,   2.94516922],
       [412.76769009, 135.48282173,  -0.49248338],
       ...,
       [431.15099334,  83.68941366,  -0.80892889],
       [147.56941378, 216.23031823,  -2.9188817 ],
       [399.61962538,  48.86623161,  -0.8140907 ]])

Convert to a DataFrame

In [152]:
df = pd.DataFrame(data, index=indice, columns=["x", "y", "z"])
df.head()
Out[152]:
              x           y         z
t ID
0 0  242.442956   77.691192  0.664777
  1  261.380074  201.793185  2.945169
  2  412.767690  135.482822 -0.492483
  3  135.073158  406.116724  2.671991
  4  235.803192  187.694907  2.775133

Convert to a "plainer" form

In [153]:
df.reset_index().head()
Out[153]:
   t  ID           x           y         z
0  0   0  242.442956   77.691192  0.664777
1  0   1  261.380074  201.793185  2.945169
2  0   2  412.767690  135.482822 -0.492483
3  0   3  135.073158  406.116724  2.671991
4  0   4  235.803192  187.694907  2.775133

Supplement: using stack to fold the column labels into the index (the aggressive form)¶

In [154]:
ses = df.stack()  # fold the "x", "y", "z" column labels into the index
ses.head()
Out[154]:
t  ID   
0  0   x    242.442956
       y     77.691192
       z      0.664777
   1   x    261.380074
       y    201.793185
dtype: float64
In [155]:
ses.index.names
ses.index.names = ['t', 'ID', "cat"]  # name the newly added index level

df = pd.DataFrame(ses.values, index=ses.index, columns=["value"])
df.head()
Out[155]:
FrozenList(['t', 'ID', None])
Out[155]:
               value
t ID cat
0 0  x    242.442956
     y     77.691192
     z      0.664777
  1  x    261.380074
     y    201.793185

Convert to a "plainer" form

In [156]:
df.reset_index().head()
Out[156]:
   t  ID cat       value
0  0   0   x  242.442956
1  0   0   y   77.691192
2  0   0   z    0.664777
3  0   1   x  261.380074
4  0   1   y  201.793185

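A more aggressive conversion¶

Going one step further, the three values along the last axis can also be collapsed into an additional index level, so that data of shape (50, 100, 3) becomes data of shape (15000, 4) with the column labels t, ID, cat, value, where cat takes one of the three values x, y, z. This long format is arguably the more general one for most pandas operations.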

In [157]:
data = np.load("data/sample.npy")
data.shape
data
Out[157]:
(50, 100, 3)
Out[157]:
array([[[ 2.42442956e+02,  7.76911920e+01,  6.64777151e-01],
        [ 2.61380074e+02,  2.01793185e+02,  2.94516922e+00],
        [ 4.12767690e+02,  1.35482822e+02, -4.92483385e-01],
        ...,
        [ 4.10753164e+02,  2.02361917e+02, -1.47121999e-01],
        [ 2.69633830e+02,  2.68148789e+02,  1.27458590e+00],
        [ 3.30322105e+02,  1.75890005e+02, -7.97956043e-01]],

       [[ 2.45365704e+02,  7.83676099e+01,  2.27428156e-01],
        [ 2.58458717e+02,  2.02475590e+02,  2.91211536e+00],
        [ 4.14913593e+02,  1.33386372e+02, -7.73741839e-01],
        ...,
        [ 4.11856535e+02,  1.99572191e+02, -1.19416454e+00],
        [ 2.70467390e+02,  2.71030660e+02,  1.28923763e+00],
        [ 3.32290845e+02,  1.73626366e+02, -8.54962584e-01]],

       [[ 2.48239647e+02,  7.75071160e+01, -2.90917521e-01],
        [ 2.55461149e+02,  2.02596349e+02, -3.18185644e+00],
        [ 4.17462899e+02,  1.31804904e+02, -5.55250026e-01],
        ...,
        [ 4.09707081e+02,  1.97479383e+02, -2.36954656e+00],
        [ 2.68405895e+02,  2.73210163e+02,  2.32837616e+00],
        [ 3.33943130e+02,  1.71122378e+02, -9.87520127e-01]],

       ...,

       [[ 3.56975004e+02,  5.97239259e+00, -5.75102134e-01],
        [ 1.30143542e+02,  1.64376675e+02, -2.87853593e+00],
        [ 4.74988523e+02,  1.15517297e+01, -8.53119091e-01],
        ...,
        [ 4.27604192e+02,  8.84718553e+01, -7.13370919e-01],
        [ 1.53469992e+02,  2.17281949e+02, -3.01382082e+00],
        [ 3.95602341e+02,  5.33207351e+01, -9.52802521e-01]],

       [[ 3.58316991e+02,  3.28928418e+00, -1.10701967e+00],
        [ 1.27151855e+02,  1.64599848e+02, -3.21605251e+00],
        [ 4.75983562e+02,  8.72155337e+00, -1.23271295e+00],
        ...,
        [ 4.29080172e+02,  8.58600584e+01, -1.05641846e+00],
        [ 1.50495321e+02,  2.16892942e+02, -3.01155720e+00],
        [ 3.97560036e+02,  5.10475365e+01, -8.59831857e-01]],

       [[ 3.60269963e+02,  1.01202655e+00, -8.61907762e-01],
        [ 1.24279619e+02,  1.63733672e+02, -2.84869734e+00],
        [ 4.77039363e+02,  5.91347873e+00, -1.21116002e+00],
        ...,
        [ 4.31150993e+02,  8.36894137e+01, -8.08928892e-01],
        [ 1.47569414e+02,  2.16230318e+02, -2.91888170e+00],
        [ 3.99619625e+02,  4.88662316e+01, -8.14090703e-01]]])
In [158]:
import pandas as pd
dim_1, dim_2, dim_3 = data.shape

# Build the index values that fill the newly added levels
indice = pd.MultiIndex.from_product([np.arange(dim_1), np.arange(dim_2), [
                                    "x", "y", "z"]], names=['t', 'ID', 'cat'])

data = data.reshape((-1, 1))

df = pd.DataFrame(data, index=indice, columns=["value"])
df.head()
Out[158]:
               value
t ID cat
0 0  x    242.442956
     y     77.691192
     z      0.664777
  1  x    261.380074
     y    201.793185

Convert to a "plainer" form

In [159]:
df.reset_index().head()
Out[159]:
   t  ID cat       value
0  0   0   x  242.442956
1  0   0   y   77.691192
2  0   0   z    0.664777
3  0   1   x  261.380074
4  0   1   y  201.793185
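
Since data/sample.npy is not available to the reader, the same conversion can be tried on synthetic data. The sketch below is only an illustration under assumptions: the array `sample`, its small shape (4, 5, 3) and the name `long_df` are made up for the example, and the random values stand in for the real measurements.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for data/sample.npy: 4 time steps, 5 IDs, 3 values each.
rng = np.random.default_rng(0)
sample = rng.normal(size=(4, 5, 3))
n_t, n_id, n_cat = sample.shape

# One index entry per element, in the same C order that reshape(-1) uses.
index = pd.MultiIndex.from_product(
    [np.arange(n_t), np.arange(n_id), ["x", "y", "z"]],
    names=["t", "ID", "cat"],
)
long_df = pd.DataFrame({"value": sample.reshape(-1)}, index=index).reset_index()

# Spot check: the row for (t=2, ID=3, cat="y") should hold sample[2, 3, 1].
row = long_df[(long_df["t"] == 2) & (long_df["ID"] == 3) & (long_df["cat"] == "y")]
print(long_df.shape)                               # (60, 4)
print(row["value"].iloc[0] == sample[2, 3, 1])     # True
```

The spot check relies on from_product enumerating its levels in the same row-major order as reshape; if the array were flattened in a different order, the labels would no longer line up with the values.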

Problems and analysis¶

numpy.sort and numpy.argsort¶

numpy.sort¶

Return a sorted copy of an array.

In other words, the input ndarray is left untouched and a sorted copy is returned

numpy.sort(a, axis=-1, kind='quicksort', order=None)
In [160]:
arr = np.array([[1, 4], [3, 2]])

np.sort(arr)  # sort along the last axis

np.sort(arr, axis=None)  # flatten, then sort

np.sort(arr, axis=0)  # sort along the given axis
Out[160]:
array([[1, 4],
       [2, 3]])
Out[160]:
array([1, 2, 3, 4])
Out[160]:
array([[1, 2],
       [3, 4]])
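
To make the "sorted copy" point concrete, here is a small sketch contrasting np.sort with the in-place method ndarray.sort. The variable names `sorted_copy`, `in_place` and `result` are chosen only for this illustration.

```python
import numpy as np

arr = np.array([[1, 4], [3, 2]])

sorted_copy = np.sort(arr, axis=None)   # new 1-D array; arr itself is unchanged
print(sorted_copy)                      # [1 2 3 4]
print(arr.tolist())                     # [[1, 4], [3, 2]]

in_place = arr.copy()
result = in_place.sort(axis=0)          # sorts in place and returns None
print(result)                           # None
print(in_place.tolist())                # [[1, 2], [3, 4]]
```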

Sorting by key (field name)

In [161]:
dtype = [('name', 'S10'), ('height', float), ('age', int)]
values = [('Arthur', 1.8, 41), ('Lancelot', 1.9, 38),
          ('Galahad', 1.7, 38)]

arr = np.array(values, dtype=dtype)

np.sort(arr, order='height')

np.sort(arr, order=['age', 'height'])
Out[161]:
array([(b'Galahad', 1.7, 38), (b'Arthur', 1.8, 41),
       (b'Lancelot', 1.9, 38)],
      dtype=[('name', 'S10'), ('height', '<f8'), ('age', '<i4')])
Out[161]:
array([(b'Galahad', 1.7, 38), (b'Lancelot', 1.9, 38),
       (b'Arthur', 1.8, 41)],
      dtype=[('name', 'S10'), ('height', '<f8'), ('age', '<i4')])

numpy.argsort¶

Returns the indices that would sort an array.

That is, for each position in the sorted ndarray it returns the "coordinate" (index) of that element in the original ndarray

numpy.argsort(a, axis=-1, kind='quicksort', order=None)
In [162]:
arr = np.array([[0, 3, 4], [2, 2, 2]])
arr

np.argsort(arr, axis=None)  # sort the flattened array
np.argsort(arr, axis=0)  # sort along an axis
np.argsort(arr, axis=1)
Out[162]:
array([[0, 3, 4],
       [2, 2, 2]])
Out[162]:
array([0, 3, 4, 5, 1, 2], dtype=int64)
Out[162]:
array([[0, 1, 1],
       [1, 0, 0]], dtype=int64)
Out[162]:
array([[0, 1, 2],
       [0, 1, 2]], dtype=int64)

Building the sorted ndarray from the indices (this only works for axis=None)

In [163]:
indice = np.unravel_index(np.argsort(arr, axis=None), arr.shape)
indice

arr[indice]  # build the sorted ndarray
Out[163]:
(array([0, 1, 1, 1, 0, 0], dtype=int64),
 array([0, 0, 1, 2, 1, 2], dtype=int64))
Out[163]:
array([0, 2, 2, 2, 3, 4])
In [164]:
indice = np.unravel_index(np.argsort(arr, axis=0), arr.shape)
indice

arr[indice]  # not the same as np.sort(arr, axis=0)
Out[164]:
(array([[0, 0, 0],
        [0, 0, 0]], dtype=int64), array([[0, 1, 1],
        [1, 0, 0]], dtype=int64))
Out[164]:
array([[0, 3, 3],
       [3, 0, 0]])

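Correctly building the sorted ndarray from the indices (for the case where axis is not None)

In practice this just means supplying the "coordinates" for the other dimensions as well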

In [165]:
indice0 = np.argsort(arr, axis=0)
indice0

indice1 = np.mgrid[0:2, 0:3][1]
indice1

arr[indice0, indice1]
Out[165]:
array([[0, 1, 1],
       [1, 0, 0]], dtype=int64)
Out[165]:
array([[0, 1, 2],
       [0, 1, 2]])
Out[165]:
array([[0, 2, 2],
       [2, 3, 4]])
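
The manual construction of the missing coordinates can also be delegated to np.take_along_axis, which pairs an argsort result with the array along a given axis. This is an alternative to the mgrid approach above, not the method used in the original notebook; the names `order` and `rebuilt` are illustrative.

```python
import numpy as np

arr = np.array([[0, 3, 4], [2, 2, 2]])

for axis in (0, 1):
    order = np.argsort(arr, axis=axis)
    rebuilt = np.take_along_axis(arr, order, axis=axis)
    # rebuilt matches np.sort along the same axis
    assert np.array_equal(rebuilt, np.sort(arr, axis=axis))

print(np.take_along_axis(arr, np.argsort(arr, axis=0), axis=0).tolist())
# [[0, 2, 2], [2, 3, 4]]
```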

Sorting by key (field name)

In [166]:
arr = np.array([(1, 0), (0, 1)], dtype=[('x', '<i4'), ('y', '<i4')])
arr

np.argsort(arr, order=('x', 'y'))
np.argsort(arr, order=('y', 'x'))
Out[166]:
array([(1, 0), (0, 1)], dtype=[('x', '<i4'), ('y', '<i4')])
Out[166]:
array([1, 0], dtype=int64)
Out[166]:
array([0, 1], dtype=int64)

outer¶

Apply the ufunc op to all pairs (a, b) with a in A and b in B.

That is, the ufunc is applied to every combination of a in A and b in B

ufunc.outer(A, B, **kwargs)

The mechanism is similar to a double for loop

r = np.empty((len(A), len(B)))
for i in range(len(A)):
    for j in range(len(B)):
        r[i, j] = op(A[i], B[j])  # op is the ufunc in question
In [167]:
A = np.array([1, 2, 3])
B = np.array([4, 5, 6])

np.multiply.outer(A, B)
Out[167]:
array([[ 4,  5,  6],
       [ 8, 10, 12],
       [12, 15, 18]])
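
The double-loop description can be checked directly against ufunc.outer. A minimal sketch for the 1-D case, where `op` and `r` are just local names and np.multiply is picked as an example ufunc:

```python
import numpy as np

A = np.array([1, 2, 3])
B = np.array([4, 5, 6])
op = np.multiply  # any binary ufunc would do here

# Explicit double loop over all pairs (A[i], B[j])
r = np.empty((len(A), len(B)), dtype=A.dtype)
for i in range(len(A)):
    for j in range(len(B)):
        r[i, j] = op(A[i], B[j])

print(np.array_equal(r, op.outer(A, B)))  # True
```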

With higher-dimensional inputs

In [168]:
A = np.array([[1, 2, 3], [4, 5, 6]])
A.shape

B = np.array([[1, 2], [3, 4]])
B.shape

C = np.multiply.outer(A, B)
C
C.shape
Out[168]:
(2, 3)
Out[168]:
(2, 2)
Out[168]:
array([[[[ 1,  2],
         [ 3,  4]],

        [[ 2,  4],
         [ 6,  8]],

        [[ 3,  6],
         [ 9, 12]]],


       [[[ 4,  8],
         [12, 16]],

        [[ 5, 10],
         [15, 20]],

        [[ 6, 12],
         [18, 24]]]])
Out[168]:
(2, 3, 2, 2)
In [169]:
C[1, 1, 1, 1]

A[1, 1]*B[1, 1]
Out[169]:
20
Out[169]:
20

The output shape is the two input shapes concatenated: (2, 3) and (2, 2) give (2, 3, 2, 2)
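
One way to see where this shape comes from is to reproduce the result with plain broadcasting. In this sketch `C_broadcast` is an illustrative name; the idea is that every element A[i, j] is paired with every element B[k, l].

```python
import numpy as np

A = np.array([[1, 2, 3], [4, 5, 6]])   # shape (2, 3)
B = np.array([[1, 2], [3, 4]])         # shape (2, 2)

C = np.multiply.outer(A, B)

# Append one length-1 axis to A for every axis of B, then let broadcasting
# pair every element of A with every element of B.
C_broadcast = A[:, :, np.newaxis, np.newaxis] * B

print(C.shape)                           # (2, 3, 2, 2)
print(np.array_equal(C, C_broadcast))    # True
print(C[1, 1, 1, 1] == A[1, 1] * B[1, 1])  # True
```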

numpy.where, numpy.nonzero and numpy.argwhere¶

The main difference between where and argwhere is the layout in which they return the "coordinates"

numpy.where¶

Return elements, either from x or y, depending on condition. If only condition is given, return condition.nonzero().

That is, it picks values from x or y according to the condition; if only the condition is given, it returns condition.nonzero()

numpy.where(condition[, x, y])

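numpy.where has two uses

  • condition together with x and y: pick elements from x or y
  • condition only: return condition.nonzero(), i.e. the counter-intuitive "coordinates" of the elements that satisfy the condition

Given condition together with x and y¶

Roughly equivalent to the following (taking the 1-D case as an example)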

[xv if c else yv for (c,xv,yv) in zip(condition,x,y)]

In [170]:
np.where([[True, False], [True, True]], [[1, 2], [3, 4]], [[9, 8], [7, 6]])

# similar to [xv if c else yv for (c, xv, yv) in zip(condition, x, y)]
arr = np.arange(9.).reshape(3, 3)
np.where(arr < 5, arr, -1)
Out[170]:
array([[1, 8],
       [3, 4]])
Out[170]:
array([[ 0.,  1.,  2.],
       [ 3.,  4., -1.],
       [-1., -1., -1.]])
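
The list-comprehension equivalence quoted above can be verified on a 1-D example. The arrays `condition`, `x` and `y` below are made-up inputs for the illustration.

```python
import numpy as np

condition = np.array([True, False, True, True])
x = np.array([1, 2, 3, 4])
y = np.array([9, 8, 7, 6])

by_comprehension = [xv if c else yv for (c, xv, yv) in zip(condition, x, y)]
by_where = np.where(condition, x, y)

print(by_where)                                      # [1 8 3 4]
print(np.array_equal(by_comprehension, by_where))    # True
```

Unlike the comprehension, np.where also broadcasts its arguments, which is why a scalar such as -1 can be passed as y.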

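Given only a condition, without x and y¶

In this case it returns "coordinates" (of the counter-intuitive kind: a tuple with one array per dimension, each holding the index values of all selected elements along that dimension; such coordinates can be used to index back into the array and recover the data)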

In [171]:
np.where([[0, 1], [1, 0]])  # equivalent to np.where(np.array([[0, 1], [1, 0]]) != 0)

np.where(np.array([[0, 1], [1, 0]]) != 0)
Out[171]:
(array([0, 1], dtype=int64), array([1, 0], dtype=int64))
Out[171]:
(array([0, 1], dtype=int64), array([1, 0], dtype=int64))
In [172]:
arr = np.arange(9.).reshape(3, 3)

np.where(arr > 5)  # returns the "coordinates"
Out[172]:
(array([2, 2, 2], dtype=int64), array([0, 1, 2], dtype=int64))

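Indexing this way gives a 1-D result. As explained earlier, the index sequences are 1-D, and indexing both dimensions of arr yields single elements, so no extra dimensions are added to the result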

In [173]:
indice = np.where(arr > 5)
indice

arr[indice]
Out[173]:
(array([2, 2, 2], dtype=int64), array([0, 1, 2], dtype=int64))
Out[173]:
array([6., 7., 8.])
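
For this particular use, indexing with the where result selects the same elements as indexing with the boolean mask itself. A short sketch, with `rows` and `cols` as illustrative names for the two coordinate arrays:

```python
import numpy as np

arr = np.arange(9.).reshape(3, 3)
rows, cols = np.where(arr > 5)

print(arr[rows, cols])     # [6. 7. 8.]
print(arr[arr > 5])        # [6. 7. 8.], the same elements via a boolean mask
```

The explicit coordinates are mainly useful when the positions themselves are needed, not just the values.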

numpy.nonzero¶

Return the indices of the elements that are non-zero.

That is, it returns the "coordinates" of the non-zero elements

numpy.nonzero(a)

The result is again given as counter-intuitive "coordinates"

In [174]:
arr = np.arange(12).reshape(3, 4)

np.nonzero(arr)

arr[np.nonzero(arr)]  # index back with the coordinates to get the data
Out[174]:
(array([0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2], dtype=int64),
 array([1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 3], dtype=int64))
Out[174]:
array([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11])

numpy.argwhere¶

Find the indices of array elements that are non-zero, grouped by element.

Looks up the non-zero elements

numpy.argwhere(a)

The result is a sequence of individual coordinates (the intuitive kind: each row is one coordinate made up of (x, y, ...))

In [175]:
arr = np.arange(12).reshape(3, 4)
arr

np.argwhere(arr > 1)
Out[175]:
array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])
Out[175]:
array([[0, 2],
       [0, 3],
       [1, 0],
       [1, 1],
       [1, 2],
       [1, 3],
       [2, 0],
       [2, 1],
       [2, 2],
       [2, 3]], dtype=int64)

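When these coordinates are used directly as an index, every value indexes only the first dimension (the coordinate array is treated as a single index array), so the result is clearly not the set of elements matching the condition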
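
If the argwhere layout is what you have, it can still be turned back into a usable index by transposing it into one array per dimension. A minimal sketch, where `coords` is an illustrative name:

```python
import numpy as np

arr = np.arange(12).reshape(3, 4)
coords = np.argwhere(arr > 1)              # shape (10, 2), one (row, col) per line

# argwhere is the transpose of the nonzero/where layout
print(np.array_equal(coords, np.transpose(np.nonzero(arr > 1))))   # True

# Transpose back and wrap in a tuple to get one index array per dimension
print(arr[tuple(coords.T)].tolist())       # [2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
```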

mgrid, ogrid, meshgrid, ndenumerate and indices¶

All of these functions or objects have to do with "coordinates"

mgrid¶

nd_grid instance which returns a dense multi-dimensional “meshgrid”.

Returns an ndarray that resembles gridded coordinates

numpy.mgrid = <numpy.lib.index_tricks.nd_grid object>

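For input sequences x1, x2, …, xn with lengths Ni = len(xi), it returns an ndarray of shape (n, N1, N2, N3, ..., Nn), where x1 varies along the 1st dimension of the first sub-array, x2 varies along the 2nd dimension of the second sub-array, and so on.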

In [176]:
arr = np.mgrid[0:4:2, 0:6:2]  # a step can be given
arr

arr.shape

np.mgrid[-1:1:5j]  # a complex step gives the number of points, endpoint included
Out[176]:
array([[[0, 0, 0],
        [2, 2, 2]],

       [[0, 2, 4],
        [0, 2, 4]]])
Out[176]:
(2, 2, 3)
Out[176]:
array([-1. , -0.5,  0. ,  0.5,  1. ])
In [177]:
arr = np.mgrid[0:4:2, 0:6:2, 0:8:2]  # a step can be given
arr

arr.shape
Out[177]:
array([[[[0, 0, 0, 0],
         [0, 0, 0, 0],
         [0, 0, 0, 0]],

        [[2, 2, 2, 2],
         [2, 2, 2, 2],
         [2, 2, 2, 2]]],


       [[[0, 0, 0, 0],
         [2, 2, 2, 2],
         [4, 4, 4, 4]],

        [[0, 0, 0, 0],
         [2, 2, 2, 2],
         [4, 4, 4, 4]]],


       [[[0, 2, 4, 6],
         [0, 2, 4, 6],
         [0, 2, 4, 6]],

        [[0, 2, 4, 6],
         [0, 2, 4, 6],
         [0, 2, 4, 6]]]])
Out[177]:
(3, 2, 3, 4)

ogrid¶

nd_grid instance which returns an open multi-dimensional “meshgrid”.

Returns a list of ndarrays that resemble gridded coordinates

numpy.ogrid = <numpy.lib.index_tricks.nd_grid object>

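For input sequences x1, x2, …, xn with lengths Ni = len(xi), it returns a list in which the first ndarray has shape (N1, 1, 1, ...), the second (1, N2, 1, ...), and so on. x1 is laid out along the 1st dimension of the first ndarray, x2 along the 2nd dimension of the second ndarray, and so on.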

In [178]:
ls = np.ogrid[0:4:2, 0:6:2]
ls
ls[0].shape, ls[1].shape

np.ogrid[-1:1:5j]  # a complex step gives the number of points, endpoint included
Out[178]:
[array([[0],
        [2]]), array([[0, 2, 4]])]
Out[178]:
((2, 1), (1, 3))
Out[178]:
array([-1. , -0.5,  0. ,  0.5,  1. ])
In [179]:
ls = np.ogrid[0:4:2, 0:6:2, 0:8:2]
ls
ls[0].shape, ls[1].shape, ls[2].shape
Out[179]:
[array([[[0]],
 
        [[2]]]), array([[[0],
         [2],
         [4]]]), array([[[0, 2, 4, 6]]])]
Out[179]:
((2, 1, 1), (1, 3, 1), (1, 1, 4))
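
The relationship between the "open" grids of ogrid and the "dense" grids of mgrid can be checked with broadcasting. In the sketch below the names `dense`, `sparse` and `expanded` are illustrative.

```python
import numpy as np

dense = np.mgrid[0:4:2, 0:6:2, 0:8:2]          # shape (3, 2, 3, 4)
sparse = np.ogrid[0:4:2, 0:6:2, 0:8:2]         # shapes (2,1,1), (1,3,1), (1,1,4)

# Broadcasting the open grids against each other gives the dense grids.
expanded = np.broadcast_arrays(*sparse)

print(all(np.array_equal(d, e) for d, e in zip(dense, expanded)))   # True
print((sparse[0] + sparse[1] + sparse[2]).shape)                     # (2, 3, 4)
```

Because arithmetic on the open grids broadcasts automatically, ogrid usually gives the same results as mgrid while storing far fewer values.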

meshgrid¶

Return coordinate matrices from coordinate vectors.

Builds, from coordinate vectors, a list of ndarrays that resemble gridded coordinates

numpy.meshgrid(*xi, **kwargs)

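For input sequences x1, x2, …, xn with lengths Ni = len(xi):

  • With indexing='ij', the returned list contains n ndarrays of shape (N1, N2, N3, ..., Nn), where x1 is laid out along the 1st dimension of the first ndarray, x2 along the 2nd dimension of the second ndarray, and so on.
  • With indexing='xy', the returned list contains n ndarrays of shape (N2, N1, N3, ..., Nn), where x1 is laid out along the 2nd dimension of the first ndarray and x2 along the 1st dimension of the second ndarray; the remaining dimensions behave as in the previous case.

Besides returning a list rather than a single ndarray, meshgrid differs from mgrid in that its layout is more involved (there are two cases).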

In [180]:
x = np.arange(0, 4, 2)
y = np.arange(0, 6, 2)
z = np.arange(0, 8, 2)
In [181]:
ls = np.meshgrid(x, y, z, indexing='ij')

ls[0].shape
ls
Out[181]:
(2, 3, 4)
Out[181]:
[array([[[0, 0, 0, 0],
         [0, 0, 0, 0],
         [0, 0, 0, 0]],
 
        [[2, 2, 2, 2],
         [2, 2, 2, 2],
         [2, 2, 2, 2]]]), array([[[0, 0, 0, 0],
         [2, 2, 2, 2],
         [4, 4, 4, 4]],
 
        [[0, 0, 0, 0],
         [2, 2, 2, 2],
         [4, 4, 4, 4]]]), array([[[0, 2, 4, 6],
         [0, 2, 4, 6],
         [0, 2, 4, 6]],
 
        [[0, 2, 4, 6],
         [0, 2, 4, 6],
         [0, 2, 4, 6]]])]
In [182]:
ls_2 = np.meshgrid(x, y, z, indexing='xy')
ls_2[0].shape
ls_2
Out[182]:
(3, 2, 4)
Out[182]:
[array([[[0, 0, 0, 0],
         [2, 2, 2, 2]],
 
        [[0, 0, 0, 0],
         [2, 2, 2, 2]],
 
        [[0, 0, 0, 0],
         [2, 2, 2, 2]]]), array([[[0, 0, 0, 0],
         [0, 0, 0, 0]],
 
        [[2, 2, 2, 2],
         [2, 2, 2, 2]],
 
        [[4, 4, 4, 4],
         [4, 4, 4, 4]]]), array([[[0, 2, 4, 6],
         [0, 2, 4, 6]],
 
        [[0, 2, 4, 6],
         [0, 2, 4, 6]],
 
        [[0, 2, 4, 6],
         [0, 2, 4, 6]]])]
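
The two layouts differ only in the order of the first two axes, which can be confirmed as follows. This is a small check on the same x, y, z as above; `grids_ij` and `grids_xy` are illustrative names.

```python
import numpy as np

x = np.arange(0, 4, 2)
y = np.arange(0, 6, 2)
z = np.arange(0, 8, 2)

grids_ij = np.meshgrid(x, y, z, indexing='ij')   # each of shape (2, 3, 4)
grids_xy = np.meshgrid(x, y, z, indexing='xy')   # each of shape (3, 2, 4)

# Swapping the first two axes converts one layout into the other.
print(all(np.array_equal(g_ij, g_xy.swapaxes(0, 1))
          for g_ij, g_xy in zip(grids_ij, grids_xy)))                       # True

# With 'ij' indexing, meshgrid reproduces mgrid for the same ranges.
print(np.array_equal(np.array(grids_ij), np.mgrid[0:4:2, 0:6:2, 0:8:2]))    # True
```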

For example, it can be used to evaluate a function over a region

In [183]:
import matplotlib.pyplot as plt
x = np.arange(-5, 5, 0.1)
y = np.arange(-5, 5, 0.1)
xx, yy = np.meshgrid(x, y, sparse=True)
z = np.sin(xx**2 + yy**2) / (xx**2 + yy**2)
h = plt.contourf(x, y, z)
plt.show()

ndenumerate¶

Multidimensional index iterator.

Iterator over a multi-dimensional array, yielding (index, value) pairs

class numpy.ndenumerate(arr)
In [184]:
arr = np.arange(12).reshape(3, 4)
for index, x in np.ndenumerate(arr):
    print(index, x)
(0, 0) 0
(0, 1) 1
(0, 2) 2
(0, 3) 3
(1, 0) 4
(1, 1) 5
(1, 2) 6
(1, 3) 7
(2, 0) 8
(2, 1) 9
(2, 2) 10
(2, 3) 11

indices¶

Return an array representing the indices of a grid.

Returns an ndarray of coordinates for a grid of the given shape

numpy.indices(dimensions, dtype=<class 'int'>)

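It can be seen as a convenience version of mgrid: it offers a little less flexibility than mgrid but is easier to use, and the layout of the result is the same as mgrid's.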

In [185]:
arr = np.arange(12).reshape(3, 4)

indice = np.indices((3, 4))
indice

arr[indice[0], indice[1]]
Out[185]:
array([[[0, 0, 0, 0],
        [1, 1, 1, 1],
        [2, 2, 2, 2]],

       [[0, 1, 2, 3],
        [0, 1, 2, 3],
        [0, 1, 2, 3]]])
Out[185]:
array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])
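The relationship to mgrid can be verified with a short comparison. This is only a sketch; grid_a, grid_b and grid_c are illustrative names.

```python
import numpy as np

grid_a = np.indices((3, 4))
grid_b = np.mgrid[0:3, 0:4]
print(np.array_equal(grid_a, grid_b))  # True

# mgrid additionally accepts an arbitrary start:stop:step per axis
grid_c = np.mgrid[1:7:2, 0:4]
print(grid_c.shape)     # (2, 3, 4)
print(grid_c[0][:, 0])  # [1 3 5]
```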

allclose与array_equal¶

allclose

Returns True if two arrays are element-wise equal within a tolerance.

Returns True if every pair of elements in the two arrays differs by no more than the allowed tolerance.

numpy.allclose(a, b, rtol=1e-05, atol=1e-08, equal_nan=False)

By default two NaNs are not considered equal; pass equal_nan=True to change this.

array_equal

True if two arrays have the same shape and elements, False otherwise.

Returns True if the two ndarrays have the same shape and all elements are equal.

numpy.array_equal(a1, a2)
In [186]:
np.allclose([1e10, 1e-7], [1.00001e10, 1e-8])

np.allclose([1e10, 1e-8], [1.00001e10, 1e-9])

np.allclose([1e10, 1e-8], [1.0001e10, 1e-9])

np.allclose([1.0, np.nan], [1.0, np.nan])

np.allclose([1.0, np.nan], [1.0, np.nan], equal_nan=True)
Out[186]:
False
Out[186]:
True
Out[186]:
False
Out[186]:
False
Out[186]:
True
In [187]:
np.array_equal([1, 2], [1, 2])

np.array_equal(np.array([1, 2]), np.array([1, 2]))
Out[187]:
True
Out[187]:
True
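The allclose results above follow from the tolerance test given in the NumPy documentation, |a - b| <= atol + rtol * |b|. The helper below is a hypothetical re-implementation for finite values only (manual_allclose is not a NumPy function), meant as a sketch of that formula rather than of NumPy's actual code.

```python
import numpy as np

def manual_allclose(a, b, rtol=1e-05, atol=1e-08):
    # Element-wise tolerance test; NaN and inf handling is left out on purpose.
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return bool(np.all(np.abs(a - b) <= atol + rtol * np.abs(b)))

pairs = [
    ([1e10, 1e-7], [1.00001e10, 1e-8]),
    ([1e10, 1e-8], [1.00001e10, 1e-9]),
    ([1e10, 1e-8], [1.0001e10, 1e-9]),
]
results = [manual_allclose(a, b) for a, b in pairs]
print(results)                                 # [False, True, False]
print([bool(np.allclose(a, b)) for a, b in pairs])  # same answers
```

Note that the test is not symmetric in a and b, because only |b| scales the relative tolerance.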

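ndarray and matrix¶

Reference: What are the differences between numpy arrays and matrices? Which one should I use?

matrix is strictly 2-dimensional, whereas ndarray can be n-dimensional. matrix is a subclass of ndarray and inherits all of its methods. The main benefit of matrix is convenient matrix multiplication: a*b is a matrix product. Note that the NumPy documentation no longer recommends using the matrix class.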

In [188]:
a = np.mat('4 3; 2 1')
a

b = np.mat('1 2; 3 4')
b

a*b
Out[188]:
matrix([[4, 3],
        [2, 1]])
Out[188]:
matrix([[1, 2],
        [3, 4]])
Out[188]:
matrix([[13, 20],
        [ 5,  8]])

However, with Python 3.5 and later, NumPy also supports the @ operator on ndarray, which performs matrix multiplication as well

In [189]:
a@b
Out[189]:
matrix([[13, 20],
        [ 5,  8]])

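Both matrix and ndarray have the .T attribute, but matrix additionally offers .I (inverse) and .H (conjugate transpose). Because * behaves differently for the two types, ** does too: on a matrix it is a matrix power, on an ndarray it is element-wise.

The two types can be converted into each other with np.asmatrix and np.asarray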
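The operator differences and the conversions can be seen side by side in the following sketch. The variable names are illustrative, and the values are the same as in the cells above.

```python
import numpy as np

a = np.array([[4, 3], [2, 1]])
b = np.array([[1, 2], [3, 4]])
ma, mb = np.asmatrix(a), np.asmatrix(b)

print(a * b)    # element-wise product on ndarray
print(ma * mb)  # matrix product on matrix
print(a @ b)    # matrix product on ndarray

print(a ** 2)   # element-wise square
print(ma ** 2)  # matrix power, the same as ma * ma

back = np.asarray(ma)  # convert back to a plain ndarray
print(type(back))
```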

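Automatic dimension reduction when slicing¶

When an ndarray is indexed with a single integer along some axis, that axis is dropped from the result. For example, taking one column of an ndarray of shape (n, m) gives shape (n,), not (n, 1).

The two shapes behave differently in matrix operations, which can cause surprises. To keep things under control, it is best to reshape an ndarray of shape (n,) to (n, 1) or (1, n) first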

In [190]:
arr = np.arange(12).reshape((3, 4))

sliced = arr[0, :]
sliced.shape  # (4,), not (1, 4)
sliced.T.shape
Out[190]:
(4,)
Out[190]:
(4,)
In [191]:
arr_mat = np.asmatrix(arr)
arr_mat

sliced_mat = arr_mat[0, :]
sliced_mat.shape
sliced_mat.T.shape
Out[191]:
matrix([[ 0,  1,  2,  3],
        [ 4,  5,  6,  7],
        [ 8,  9, 10, 11]])
Out[191]:
(1, 4)
Out[191]:
(4, 1)

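Computations on an ndarray of shape (n,) may give unexpected results: matmul treats it as a row vector on the left-hand side and as a column vector on the right-hand side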

In [192]:
sliced@np.ones((4, 1))

np.ones((1, 4))@sliced

try:
    sliced@np.ones((1, 4))
except Exception as e:
    print(e)

try:
    np.ones((4, 1))@sliced
except Exception as e:
    print(e)
Out[192]:
array([6.])
Out[192]:
array([6.])
matmul: Input operand 1 has a mismatch in its core dimension 0, with gufunc signature (n?,k),(k,m?)->(n?,m?) (size 1 is different from 4)
matmul: Input operand 1 has a mismatch in its core dimension 0, with gufunc signature (n?,k),(k,m?)->(n?,m?) (size 4 is different from 1)

Broadcasting rules for (n,) and (n, 1)¶

In most cases, (n,) behaves like (1, n)

In [193]:
arr = np.arange(3)
arr

mask = np.arange(9).reshape((3, 3)) > 5
mask
Out[193]:
array([0, 1, 2])
Out[193]:
array([[False, False, False],
       [False, False, False],
       [ True,  True,  True]])
In [194]:
arr * mask

mask*arr
Out[194]:
array([[0, 0, 0],
       [0, 0, 0],
       [0, 1, 2]])
Out[194]:
array([[0, 0, 0],
       [0, 0, 0],
       [0, 1, 2]])
In [195]:
arr.reshape(3, 1)*mask
Out[195]:
array([[0, 0, 0],
       [0, 0, 0],
       [2, 2, 2]])

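The two give completely different results

To keep the dimensions of an ndarray under control, avoid ndarrays with shapes like (5,).

Solutions:

  • Wherever a 1-D ndarray may appear, add a reshape(n, 1), and where necessary an assert statement on the shape
  • Use the keepdims parameter of reduction functions such as np.sum; slicing has no such parameter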
In [196]:
sum_ = np.sum(arr, axis=0)
sum_
sum_.shape

sum__ = np.sum(arr, axis=0, keepdims=True)
sum__
sum__.shape
Out[196]:
3
Out[196]:
()
Out[196]:
array([3])
Out[196]:
(1,)
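Since slicing has no keepdims parameter, the axis has to be kept by the form of the index itself. The sketch below lists a few common ways; the variable names are illustrative.

```python
import numpy as np

arr = np.arange(12).reshape(3, 4)

col_int = arr[:, 0]              # an integer index drops the axis
col_slice = arr[:, 0:1]          # a length-1 slice keeps the axis (view)
col_list = arr[:, [0]]           # a one-element list keeps it too (copy)
col_new = arr[:, 0, np.newaxis]  # drop the axis, then insert a new one

print(col_int.shape, col_slice.shape, col_list.shape, col_new.shape)
# (3,) (3, 1) (3, 1) (3, 1)
```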

tile和repeat¶

tile

Construct an array by repeating A the number of times given by reps.

Repeats the input ndarray as a whole

numpy.tile(A, reps)

repeat

Repeat elements of an array.

Repeats element by element; with the default axis=None the input is flattened first

numpy.repeat(a, repeats, axis=None)
In [197]:
arr = np.arange(12).reshape((3, 4))

np.tile(arr, 2)

np.tile(arr, (2, 2))

np.tile(arr, (2, 1, 2))
Out[197]:
array([[ 0,  1,  2,  3,  0,  1,  2,  3],
       [ 4,  5,  6,  7,  4,  5,  6,  7],
       [ 8,  9, 10, 11,  8,  9, 10, 11]])
Out[197]:
array([[ 0,  1,  2,  3,  0,  1,  2,  3],
       [ 4,  5,  6,  7,  4,  5,  6,  7],
       [ 8,  9, 10, 11,  8,  9, 10, 11],
       [ 0,  1,  2,  3,  0,  1,  2,  3],
       [ 4,  5,  6,  7,  4,  5,  6,  7],
       [ 8,  9, 10, 11,  8,  9, 10, 11]])
Out[197]:
array([[[ 0,  1,  2,  3,  0,  1,  2,  3],
        [ 4,  5,  6,  7,  4,  5,  6,  7],
        [ 8,  9, 10, 11,  8,  9, 10, 11]],

       [[ 0,  1,  2,  3,  0,  1,  2,  3],
        [ 4,  5,  6,  7,  4,  5,  6,  7],
        [ 8,  9, 10, 11,  8,  9, 10, 11]]])
In [198]:
np.repeat(3, 4)
Out[198]:
array([3, 3, 3, 3])
In [199]:
np.repeat(arr, 2)
np.repeat(arr, 3, axis=1)
np.repeat(arr, [1, 2, 3], axis=0)  # number of repetitions for each row
Out[199]:
array([ 0,  0,  1,  1,  2,  2,  3,  3,  4,  4,  5,  5,  6,  6,  7,  7,  8,
        8,  9,  9, 10, 10, 11, 11])
Out[199]:
array([[ 0,  0,  0,  1,  1,  1,  2,  2,  2,  3,  3,  3],
       [ 4,  4,  4,  5,  5,  5,  6,  6,  6,  7,  7,  7],
       [ 8,  8,  8,  9,  9,  9, 10, 10, 10, 11, 11, 11]])
Out[199]:
array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11],
       [ 8,  9, 10, 11],
       [ 8,  9, 10, 11]])
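On a 1-D input the contrast between the two functions is easiest to see. A small sketch with illustrative names:

```python
import numpy as np

v = np.array([1, 2, 3])
tiled = np.tile(v, 2)       # the whole array repeated: [1 2 3 1 2 3]
repeated = np.repeat(v, 2)  # each element repeated:    [1 1 2 2 3 3]
print(tiled, repeated)

arr = np.arange(12).reshape(3, 4)
print(np.tile(arr, (2, 1, 2)).shape)  # (2, 3, 8): arr is treated as (1, 3, 4)
print(np.repeat(arr, 2).shape)        # (24,): flattened because axis=None
```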

ndarrays holding a single number¶

Their behavior is somewhat odd

First, the fairly ordinary ndarray of shape (1,)

In [200]:
arr = np.array([12])
arr.shape
arr
Out[200]:
(1,)
Out[200]:
array([12])

Now look at an ndarray created by passing in a single number

In [201]:
arr = np.array(12)
arr
Out[201]:
array(12)
In [202]:
type(arr)
Out[202]:
numpy.ndarray
In [203]:
arr.shape
Out[203]:
()

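Why is the shape empty? Because np.array(12) creates a 0-dimensional ndarray, whose shape is the empty tuple ().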

In [204]:
type(arr+3)
Out[204]:
numpy.int32

After an arithmetic operation, the result is a NumPy scalar rather than an ndarray

In [205]:
try:
    num = arr[0]
except Exception as e:
    print(e)
too many indices for array

It cannot even be indexed with an integer
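A 0-dimensional ndarray can still be read; it just takes an empty index. The following sketch shows a few ways to get the value out, using the same np.array(12) as above.

```python
import numpy as np

arr = np.array(12)
print(arr.ndim, arr.shape, arr.size)  # 0 () 1

scalar = arr[()]     # an empty tuple as index returns a NumPy scalar
plain = arr.item()   # item() returns a plain Python int
still_0d = arr[...]  # Ellipsis returns a 0-d ndarray again

one_d = np.atleast_1d(arr)  # promote to shape (1,) so that [0] works
print(scalar, plain, still_0d.ndim, one_d.shape, one_d[0])
```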

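NumPy performance comparisons¶

Compare two approaches: preallocating an array and then filling it, versus generating the pieces first and then combining them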

In [206]:
def func1():
    P = np.empty((100, 2))
    P[:, 0] = np.arange(100)
    P[:, 1] = np.arange(100)
    return P


def func2():
    x = np.arange(100)
    y = np.arange(100)
    P = np.vstack([x, y])  # note: shape (2, 100), not (100, 2) as in func1
    return P


%timeit func1()
%timeit func2()
2.84 µs ± 332 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
4.95 µs ± 241 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

As shown, preallocating and then filling is faster

Similarly, for append operations, which is faster: a list or an ndarray?

In [207]:
def func1():
    ls = []  # create the list inside the function so every timed run starts empty
    for i in range(100):
        ls.append(i)
    return ls


def func2():
    x = np.array([])
    for i in range(100):
        x = np.append(x, i)
    return x


%timeit func1()
%timeit func2()
8.56 µs ± 449 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
296 µs ± 8.8 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

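For operations that frequently change the size, list is clearly better than ndarray

The reason is that NumPy allocates one contiguous block of memory and manages it through efficient indexing. np.append never grows an array in place: each call allocates a new array and copies the data, so frequent appends throw away the efficiency NumPy is built on.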
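When the final size is not known in advance, a common pattern is to collect values in a list and convert once at the end; when it is known, preallocation works. The sketch below shows both, with hypothetical helper names (grow_then_convert and preallocate are not from the original post).

```python
import numpy as np

def grow_then_convert(n):
    values = []
    for i in range(n):
        values.append(i)     # cheap list append
    return np.array(values)  # a single conversion at the end

def preallocate(n):
    out = np.empty(n, dtype=np.int64)  # one allocation up front
    for i in range(n):
        out[i] = i
    return out

a = grow_then_convert(100)
b = preallocate(100)
print(np.array_equal(a, b))  # True
```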

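Swapping data and comparing in NumPy¶

This involves some differences between copies and references

Swapping¶

Swapping data between Python lists is simple and can be written directly in the convenient form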

In [208]:
a = [1, 2, 3, 4]
b = [5, 6, 7, 8]

a[1:3], b[1:3] = b[1:3], a[1:3]
a
b
Out[208]:
[1, 6, 7, 4]
Out[208]:
[5, 2, 3, 8]

Written this way with ndarrays, it goes wrong

In [209]:
a = np.array([1, 2, 3, 4])
b = np.array([5, 6, 7, 8])

a[1:3], b[1:3] = b[1:3], a[1:3]
a
b
Out[209]:
array([1, 6, 7, 4])
Out[209]:
array([5, 6, 7, 8])

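Because of the evaluation order, the values cannot be swapped the way they are in plain Python

More precisely, the swap fails because the slices on the right-hand side are views, combined with Python's evaluation order: both views are created first, then a[1:3] is overwritten, and by the time b[1:3] is assigned, the view of a already shows the new values. Replacing the right-hand side with advanced indexing (which returns copies) fixes this.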

In [210]:
a = np.array([1, 2, 3, 4])
b = np.array([5, 6, 7, 8])

a[1:3], b[1:3] = b[[1, 2]], a[[1, 2]]
a
b
Out[210]:
array([1, 6, 7, 4])
Out[210]:
array([5, 2, 3, 8])

Another solution is to use copy

In [211]:
a = np.array([1, 2, 3, 4])
b = np.array([5, 6, 7, 8])

a[1:3], b[1:3] = b[1:3].copy(), a[1:3].copy()
a
b
Out[211]:
array([1, 6, 7, 4])
Out[211]:
array([5, 2, 3, 8])
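The failed swap with plain slices can be replayed step by step. This sketch spells out what the tuple assignment does; rhs_for_a and rhs_for_b are illustrative names for the two temporaries.

```python
import numpy as np

a = np.array([1, 2, 3, 4])
b = np.array([5, 6, 7, 8])

# Step 1: the right-hand side is evaluated first, producing two views
rhs_for_a = b[1:3]
rhs_for_b = a[1:3]
print(np.shares_memory(rhs_for_b, a))  # True

# Step 2: the first target is assigned, a becomes [1, 6, 7, 4]
a[1:3] = rhs_for_a
print(rhs_for_b)  # [6 7], because the view follows the change in a

# Step 3: the second target receives 6, 7, which b already holds
b[1:3] = rhs_for_b
print(a, b)  # [1 6 7 4] [5 6 7 8]
```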

Comparison¶

In [212]:
a == b
Out[212]:
array([False, False, False, False])

The == operator in NumPy compares element-wise, so use numpy.array_equal() or numpy.allclose() to compare whole arrays

In [213]:
np.array_equal(a, b)
Out[213]:
False

Reshape operations in NumPy¶

Changing shapes in NumPy has some tricky corners

In [214]:
arr = np.ones((4, 4))

sliced = arr[:3, :]
sliced.shape
Out[214]:
(3, 4)

Reshaping like this works fine

In [215]:
sliced.shape = 4, 3
sliced
Out[215]:
array([[1., 1., 1.],
       [1., 1., 1.],
       [1., 1., 1.],
       [1., 1., 1.]])
In [216]:
sliced_1 = arr[:, :3]
sliced_1.shape
Out[216]:
(4, 3)

What about changing the shape on top of a slice?

In [217]:
try:
    sliced_1.shape = 3, 4
except AttributeError as e:
    print(e)
incompatible shape for a non-contiguous array

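Here the slice cannot be reshaped in place. The reason lies in how an ndarray is stored in memory: when the ndarray is created, its data is laid out sequentially. Taking the first three columns already means skipping through memory, and changing the shape would require yet another skipping pattern that can no longer be described with strides as a view onto the original data block. The shape therefore cannot be changed in place.

For ndarray memory management, see http://www.labri.fr/perso/nrougier/from-python-to-numpy/#anatomy-of-an-array

Using resize for the shape change raises an error as well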

In [218]:
try:
    sliced_1.resize(3, 4, refcheck=False)
except ValueError as e:
    print(e)
resize only works on single-segment arrays

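The message from resize is clearer: "only works on single-segment arrays". The data is split into segments, so the access pattern can no longer be described by strides (it would be too complex)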
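Whether an in-place shape change can work is visible from the contiguity flags, and the reshape method sidesteps the error by copying when it has to. A minimal sketch with illustrative names:

```python
import numpy as np

arr = np.ones((4, 4))
rows = arr[:3, :]  # first three rows: one unbroken block of memory
cols = arr[:, :3]  # first three columns: a gap after every row

print(rows.flags['C_CONTIGUOUS'], cols.flags['C_CONTIGUOUS'])  # True False

# reshape() returns a view when it can and silently copies when it cannot
rows_reshaped = rows.reshape(4, 3)
cols_reshaped = cols.reshape(3, 4)
print(np.shares_memory(rows_reshaped, arr))  # True, still a view
print(np.shares_memory(cols_reshaped, arr))  # False, a copy was made
```

Assigning to .shape is the strict variant: it never copies, which is why it raises an error for the column slice.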

Basic indexing vs. advanced indexing (fancy indexing), and an analysis of special cases¶

Let's start with the official explanation: https://docs.scipy.org/doc/numpy/reference/arrays.indexing.html#advanced-indexing

Advanced indexing is triggered when the selection object, obj, is a non-tuple sequence object, an ndarray (of data type integer or bool), or a tuple with at least one sequence object or ndarray (of data type integer or bool). There are two types of advanced indexing: integer and Boolean.

Advanced indexing always returns a copy of the data (contrast with basic slicing that returns a view).

Advanced indexing (also called fancy indexing) is a concept distinct from basic indexing. When indexing an ndarray, if the index on any dimension is given as an explicit sequence such as [1,2,4] or [True, True, False] (rather than in the start:stop:step form, such as 1:5:2), the operation counts as advanced indexing. Basic indexing, by contrast, is roughly the same thing as slicing: the index on every dimension has the start:stop:step form or is a single integer, as in arr[0:4:2, 1:5:1] or arr[1, :-1].

Slicing (basic indexing) always returns a view, while advanced indexing always returns a copy. This distinction applies only when the expression is read (used as an rvalue). When used as an assignment target (lvalue), both basic and advanced indexing modify the original ndarray.
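The view and copy rule can be checked with np.shares_memory. A small sketch, with illustrative variable names:

```python
import numpy as np

arr = np.arange(10)
basic = arr[1:5:2]         # basic slice
fancy = arr[[1, 3]]        # integer-sequence index
masked = arr[arr % 2 == 1]  # boolean index

print(np.shares_memory(arr, basic))   # True:  view
print(np.shares_memory(arr, fancy))   # False: copy
print(np.shares_memory(arr, masked))  # False: copy

basic[0] = -1  # writes through to arr[1]
fancy[1] = -3  # changes only the copy
print(arr[1], arr[3])  # -1 3

# As an assignment target, advanced indexing does write into arr
arr[[1, 3]] = 100
print(arr[1], arr[3])  # 100 100
```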

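Slices and advanced indices in the same expression¶

An example is arr[1:4, [2,3,4]]. There are really two kinds of such mixed expressions:

  • The advanced index appears in the first dimension
  • The advanced index appears in a later dimension

The conclusion first: in both cases the result is a copy, and both should be classed as advanced indexing, because an explicitly given sequence appears in the index. Note that the rules of advanced indexing then apply: if several explicit sequences appear, their shapes must match (more precisely, be broadcastable to a common shape)

Advanced index in the first dimension¶

In this case the resulting ndarray is a copy of some of the elements of the original ndarray and is its own base, i.e. it owns its data.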

In [219]:
arr = np.arange(125).reshape((5, 5, 5))

s = arr[[1, 2, 4], 1:5, [2, 3, 4]]
s

s.flags.owndata
Out[219]:
array([[ 32,  37,  42,  47],
       [ 58,  63,  68,  73],
       [109, 114, 119, 124]])
Out[219]:
True

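Advanced index in a dimension after the first¶

In this case the resulting ndarray is also a copy of some of the elements of the original ndarray, but it is not its own base, i.e. it does not own its data; its base is an intermediate ndarray.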

In [220]:
arr = np.arange(125).reshape((5, 5, 5))

s = arr[1:5, [1, 2, 4], [2, 3, 4]]
s
Out[220]:
array([[ 32,  38,  49],
       [ 57,  63,  74],
       [ 82,  88,  99],
       [107, 113, 124]])
In [221]:
s.flags.owndata  # does not own its data

s.base is (arr if arr.flags.owndata else arr.base)  # and its base is not arr either
Out[221]:
False
Out[221]:
False
In [222]:
# s has a different shape from its base, and the base of s is not arr
s.base.shape
s.shape
Out[222]:
(3, 4)
Out[222]:
(4, 3)

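A single integer as an index (and a list containing a single integer)¶

First, when the index on some dimension is a single integer, it should be seen as a special form of start:stop:step. Note, however, that although arr[1, 2] and arr[1:2, 2] select the same data, the results differ in their number of dimensions.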

In [223]:
arr = np.ones((5, 5, 5))

arr[1, 2].shape
arr[1:2, 2].shape
Out[223]:
(5,)
Out[223]:
(1, 5)

A list containing a single integer counts as advanced indexing when used as an index

As indices, 1 and [1] are different

In [224]:
arr = np.arange(125).reshape((5, 5, 5))

s = arr[2, 1:3]
s.flags.owndata
s.base is (arr if arr.flags.owndata else arr.base)  # slice, returns a view
Out[224]:
False
Out[224]:
True
In [225]:
s = arr[[2], 1:3]
s.flags.owndata  # advanced indexing, returns a copy
Out[225]:
True

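Special slicing cases¶

The special cases of interest are those where every index given is a single integer. There are two of them.

  • Fewer indices than the original ndarray has dimensions
  • As many indices as the original ndarray has dimensions

With fewer indices than dimensions, the result is clearly still a slice, even though each index is a single number¶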

In [226]:
arr = np.ones((5, 5, 5))

s = arr[1, 2]  # fewer indices than arr has dimensions
s  # returns an ndarray

s.flags.owndata  # does not own its data

s.base is (arr if arr.flags.owndata else arr.base)  # it is a view of arr
Out[226]:
array([1., 1., 1., 1., 1.])
Out[226]:
False
Out[226]:
True

Note the dimensions of the result

In [227]:
s = arr[1:2, 2:3]  # same data as above, but the sliced axes are kept
s  # returns an ndarray

s.flags.owndata  # does not own its data

s.base is (arr if arr.flags.owndata else arr.base)  # it is a view of arr
Out[227]:
array([[[1., 1., 1., 1., 1.]]])
Out[227]:
False
Out[227]:
True

With as many indices as dimensions, there are two cases.¶

  • Slice notation throughout (selecting just one data point): this returns a view, an ndarray with a single element
In [228]:
s = arr[1:2, 2:3, 3:4]
s  # an ndarray with a single element

s.flags.owndata  # does not own its data

s.base is (arr if arr.flags.owndata else arr.base)  # it is a view of arr
Out[228]:
array([[[1.]]])
Out[228]:
False
Out[228]:
True
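  • Integer notation, where the index on every dimension is a single integer

This returns a single element of the original ndarray as a NumPy scalar (a copy, not a plain Python object). By convention this case still counts as basic indexing, although by the view/copy rule it should really be classed as a copy.

In [229]:
s = arr[1, 2, 3]  # as many indices as arr has dimensions
s  # returns a single element
type(s)  # a NumPy scalar type

s.flags.owndata
s.base is (arr if arr.flags.owndata else arr.base)  # not a view of arr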
Out[229]:
1.0
Out[229]:
numpy.float64
Out[229]:
True
Out[229]:
False

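Other cases that need special care¶

Note: arr[(1,2,3),] and arr[(1,2,3)] are fundamentally different. The latter is equivalent to arr[1,2,3], a special case of slicing that indexes the first three dimensions; the former triggers advanced indexing on the first dimension only.

In addition, arr[[1,2,3]] is equivalent to arr[[1,2,3],] and triggers advanced indexing, whereas arr[[1,2,slice(None)]] is only a slice in the NumPy version used here. As the FutureWarning in the output below states, this list form is deprecated; write arr[tuple(seq)] instead.

As mentioned in the earlier section on advanced indexing tricks, NumPy strips the outermost wrapper of an index, so arr[(1,2,3)] is equivalent to arr[1,2,3]. The special case is that arr[[1,2,3]] is not equivalent to arr[1,2,3], while arr[[1,2,slice(None)]] is still equivalent to arr[1,2] or arr[1,2,...].

So, although the outer wrapper of arr[(1,2,3)] and arr[[1,2,slice(None)]] is stripped, arr[[1,2,3]] is the exception to watch for.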

In [230]:
arr = np.ones((5, 5, 5, 5))

arr[(1, 2, 3), ].flags.owndata  # triggers advanced indexing

arr[(1, 2, 3)]
arr[(1, 2, 3)].flags.owndata  # slice

arr[1, 2, 3]
arr[1, 2, 3].flags.owndata  # slice
Out[230]:
True
Out[230]:
array([1., 1., 1., 1., 1.])
Out[230]:
False
Out[230]:
array([1., 1., 1., 1., 1.])
Out[230]:
False
In [231]:
arr[[1, 2, 3]].flags.owndata  # indexes the first dimension

np.allclose(arr[[1, 2, 3]], arr[[1, 2, 3], ])
Out[231]:
True
Out[231]:
True
In [232]:
arr[[1, 2, slice(None)]]  # slice
arr[[1, 2, slice(None)]].flags.owndata

np.allclose(arr[[1, 2, slice(None)]], arr[1, 2])
c:\users\twang\appdata\local\conda\conda\envs\py36\lib\site-packages\ipykernel_launcher.py:1: FutureWarning: Using a non-tuple sequence for multidimensional indexing is deprecated; use `arr[tuple(seq)]` instead of `arr[seq]`. In the future this will be interpreted as an array index, `arr[np.array(seq)]`, which will result either in an error or a different result.
  """Entry point for launching an IPython kernel.
Out[232]:
array([[1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1.]])
c:\users\twang\appdata\local\conda\conda\envs\py36\lib\site-packages\ipykernel_launcher.py:2: FutureWarning: Using a non-tuple sequence for multidimensional indexing is deprecated; use `arr[tuple(seq)]` instead of `arr[seq]`. In the future this will be interpreted as an array index, `arr[np.array(seq)]`, which will result either in an error or a different result.
  
Out[232]:
False
c:\users\twang\appdata\local\conda\conda\envs\py36\lib\site-packages\ipykernel_launcher.py:4: FutureWarning: Using a non-tuple sequence for multidimensional indexing is deprecated; use `arr[tuple(seq)]` instead of `arr[seq]`. In the future this will be interpreted as an array index, `arr[np.array(seq)]`, which will result either in an error or a different result.
  after removing the cwd from sys.path.
Out[232]:
True

Whether basic or advanced indexing is used, assigning to it as an lvalue affects the original ndarray¶

In [233]:
arr = np.ones((5, 5))

arr[:2, :2] = 999  # slice as the assignment target
arr
Out[233]:
array([[999., 999.,   1.,   1.,   1.],
       [999., 999.,   1.,   1.,   1.],
       [  1.,   1.,   1.,   1.,   1.],
       [  1.,   1.,   1.,   1.,   1.],
       [  1.,   1.,   1.,   1.,   1.]])
In [234]:
arr = np.ones((5, 5))

arr[[0, 1, 2], [0, 1, 2]] = 999  # advanced indexing as the assignment target
arr
Out[234]:
array([[999.,   1.,   1.,   1.,   1.],
       [  1., 999.,   1.,   1.,   1.],
       [  1.,   1., 999.,   1.,   1.],
       [  1.,   1.,   1.,   1.,   1.],
       [  1.,   1.,   1.,   1.,   1.]])

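Getting the data at the intersections of given rows and columns¶

Because of how advanced indexing works, arr[[1,2,3], [2,3,4]] does not return the data at the intersections of rows 1, 2, 3 and columns 2, 3, 4. It returns only an ndarray made of the three values arr[1,2], arr[2,3] and arr[3,4].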

In [235]:
arr = np.arange(25).reshape((5, 5))

arr[[1, 2, 3], [2, 3, 4]]
Out[235]:
array([ 7, 13, 19])

An expression such as arr[[1,2,3], [2,3]] even raises an error, because the index shapes on the two dimensions do not match

In [236]:
try:
    arr[[1, 2, 3], [2, 3]]
except Exception as e:
    print(e)
shape mismatch: indexing arrays could not be broadcast together with shapes (3,) (2,) 

To get the data at the intersections of the given rows and columns, index each dimension separately

In [237]:
arr[[1, 2, 3], :][:, [2, -1]]
Out[237]:
array([[ 7,  9],
       [12, 14],
       [17, 19]])
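The same cross selection can be done in one step with np.ix_, which builds index arrays that broadcast against each other. A sketch using the same rows and columns as above (the names rows, cols, r and c are illustrative):

```python
import numpy as np

arr = np.arange(25).reshape(5, 5)
rows, cols = [1, 2, 3], [2, -1]

cross = arr[np.ix_(rows, cols)]
print(cross)
# [[ 7  9]
#  [12 14]
#  [17 19]]

# np.ix_ is shorthand for broadcasting a column of row indices
# against a row of column indices
r = np.array(rows)[:, np.newaxis]  # shape (3, 1)
c = np.array(cols)                 # shape (2,)
print(np.array_equal(cross, arr[r, c]))  # True
```

Unlike the chained form, arr[np.ix_(rows, cols)] is a single indexing operation, so it can also be used as an assignment target.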

Or like this, although it does not seem very useful

Using bincount

In [238]:
arr = np.arange(25).reshape((5, 5))
indice_d1 = [1, 2, 3]
indice_d2 = [2, -1]

shape_d1, shape_d2 = arr.shape

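# append the largest possible index of each dimension so that bincount covers the full axis
indice_d1.append(shape_d1-1)
indice_d2.append(shape_d2-1)

# map negative indices to their non-negative equivalents
indice_d1 = [(x+shape_d1) % shape_d1 for x in indice_d1]
indice_d2 = [(x+shape_d2) % shape_d2 for x in indice_d2]

# convert to boolean masks, removing the extra count added above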
indice_d1 = np.bincount(indice_d1)
indice_d1[-1] -= 1
mask1 = (indice_d1 != 0)
indice_d2 = np.bincount(indice_d2)
indice_d2[-1] -= 1
mask2 = (indice_d2 != 0)

arr[mask1, :][:, mask2]
Out[238]:
array([[ 7,  9],
       [12, 14],
       [17, 19]])

Chained indexing on ndarrays¶

As an rvalue¶

Apply the view and copy rules described above

A view of a view is still a view

In [239]:
arr = np.arange(25).reshape((5, 5))

arr[1:3, 2:4][1, :]
arr[1:3, 2:4][1, :].flags.owndata
arr[1:3, 2:4][1, :].base is (arr if arr.flags.owndata else arr.base)
Out[239]:
array([12, 13])
Out[239]:
False
Out[239]:
True

A copy taken from a view is a copy

In [240]:
arr[1:3, 2:4][[1], :]
arr[1:3, 2:4][[1], :].flags.owndata
arr[1:3, 2:4][[1], :].base is (arr if arr.flags.owndata else arr.base)
Out[240]:
array([[12, 13]])
Out[240]:
True
Out[240]:
False

A view of a copy is a view of an intermediate ndarray (the copy)

In [241]:
arr[[1, 2, 3], 2:4][0, :]
arr[[1, 2, 3], 2:4][0, :].flags.owndata
arr[[1, 2, 3], 2:4][0, :].base is (arr if arr.flags.owndata else arr.base)
Out[241]:
array([7, 8])
Out[241]:
False
Out[241]:
False

As an lvalue¶

Assigning to a view of a view affects the original ndarray

In [242]:
arr = np.arange(25).reshape((5, 5))

arr[1:3, 2:4][1, :] = 999
arr
Out[242]:
array([[  0,   1,   2,   3,   4],
       [  5,   6,   7,   8,   9],
       [ 10,  11, 999, 999,  14],
       [ 15,  16,  17,  18,  19],
       [ 20,  21,  22,  23,  24]])

Assigning through advanced indexing applied to a view also affects the original ndarray

In [243]:
arr = np.arange(25).reshape((5, 5))

arr[1:3, 2:4][[1], :] = 999
arr
Out[243]:
array([[  0,   1,   2,   3,   4],
       [  5,   6,   7,   8,   9],
       [ 10,  11, 999, 999,  14],
       [ 15,  16,  17,  18,  19],
       [ 20,  21,  22,  23,  24]])

Assigning to a view of a copy does not affect the original ndarray

In [244]:
arr = np.arange(25).reshape((5, 5))

arr[[1, 2, 3], 2:4][0, :] = 999
arr
Out[244]:
array([[ 0,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14],
       [15, 16, 17, 18, 19],
       [20, 21, 22, 23, 24]])

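My reading is that what matters is whether everything before the last index is a view or a copy of the original ndarray. If it is a view, the assignment affects the original ndarray regardless of whether the last index is basic or advanced. If it is a copy, the assignment cannot affect the original ndarray, again regardless of the last index.

Let's verify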

In [245]:
arr = np.arange(25).reshape((5, 5))

# everything before the last index is a view, so even an advanced last index modifies arr
arr[:, :-1][:-1, :][[0, 1, 2], :] = 999
arr
Out[245]:
array([[999, 999, 999, 999,   4],
       [999, 999, 999, 999,   9],
       [999, 999, 999, 999,  14],
       [ 15,  16,  17,  18,  19],
       [ 20,  21,  22,  23,  24]])
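One way to see why the rule holds is to split a chained assignment into the two calls Python makes: a read (__getitem__) for everything but the last index, then a write (__setitem__) on the result. The sketch below does this by hand; tmp and tmp2 are illustrative names for the temporaries.

```python
import numpy as np

arr = np.arange(25).reshape(5, 5)
# arr[1:3, 2:4][[1], :] = 999 is carried out roughly as:
tmp = arr[1:3, 2:4]   # read with slices: tmp is a view of arr
tmp[[1], :] = 999     # write into tmp's memory, which is arr's memory
print(arr[2, 2:4])    # [999 999]

arr2 = np.arange(25).reshape(5, 5)
# arr2[[1, 2, 3], 2:4][0, :] = 999 is carried out roughly as:
tmp2 = arr2[[1, 2, 3], 2:4]  # read with a list: tmp2 is a copy
tmp2[0, :] = 999             # write into the temporary copy only
print(arr2[1, 2:4])          # [7 8]
```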

A deeper look at ndarray memory layout¶

From Python to Numpy Anatomy of An Array

Motivation¶

Given the following ndarray

In [246]:
arr = np.ones(4*1000000, np.float32)

How do we set all of its elements to 8?

In [247]:
%timeit arr[...] = 8
836 µs ± 52.4 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
In [248]:
%timeit arr.view(np.float16)[...] = 8
%timeit arr.view(np.int16)[...] = 8
%timeit arr.view(np.int32)[...] = 8
%timeit arr.view(np.float32)[...] = 8
%timeit arr.view(np.int64)[...] = 8
%timeit arr.view(np.float64)[...] = 8
%timeit arr.view(np.complex128)[...] = 8
%timeit arr.view(np.int8)[...] = 8
867 µs ± 58.9 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
811 µs ± 9.41 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
811 µs ± 8.85 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
807 µs ± 6.69 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
813 µs ± 7.55 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
805 µs ± 4.6 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
1.31 ms ± 24.5 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
818 µs ± 6.79 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

The running times look similar, but the differences that do exist reflect how NumPy manages the memory of an ndarray.

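Memory layout¶

The NumPy documentation defines ndarray as follows

An instance of class ndarray consists of a contiguous one-dimensional segment of computer memory (owned by the array, or by some other object), combined with an indexing scheme that maps N integers into the location of an item in the block.

That is, a contiguous 1-D block of memory, together with an indexing scheme that maps integer indices to the corresponding location in the block.

The indexing scheme is defined by the shape and the data type, and these two are all that is needed to define a new ndarray.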

In [249]:
Z = np.arange(9).reshape(3, 3).astype(np.int16)
Z
Out[249]:
array([[0, 1, 2],
       [3, 4, 5],
       [6, 7, 8]], dtype=int16)

The itemsize of Z is 2 bytes (int16), its shape is (3, 3), and it has 2 dimensions (len(Z.shape)).

In [250]:
Z.itemsize

Z.shape

Z.ndim
Out[250]:
2
Out[250]:
(3, 3)
Out[250]:
2

Moreover, since Z is not a view of another ndarray, its strides can be derived: the number of bytes to step in each dimension when traversing the ndarray.

In [251]:
strides = Z.shape[1]*Z.itemsize, Z.itemsize
strides

Z.strides
Out[251]:
(6, 2)
Out[251]:
(6, 2)

With this information, we can work out how to locate any particular element of the ndarray.

Let's check with the tobytes method

In [252]:
Z = np.arange(9).reshape(3, 3).astype(np.int16)
index = 1, 1
Z[index].tobytes()

offset_start = 0
for i in range(Z.ndim):
    offset_start += Z.strides[i]*index[i]

Z.tobytes()[offset_start:offset_start + Z.itemsize]
Out[252]:
b'\x04\x00'
Out[252]:
b'\x04\x00'

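Visualizing the ndarray memory layout¶

[Figure: item layout]

[Figure: flattened item layout]

[Figure: memory layout (C order, big endian)]

Taking a slice of Z gives a view that can only be described by shape, data type and strides together, because the strides can no longer be derived from the shape and the data type.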

In [253]:
V = Z[::2, ::2]
V
Out[253]:
array([[0, 2],
       [6, 8]], dtype=int16)
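The strides of the view can be inspected directly and compared with what shape and itemsize alone would suggest. A short sketch; naive is an illustrative name for the guess that assumes a contiguous layout.

```python
import numpy as np

Z = np.arange(9).reshape(3, 3).astype(np.int16)
V = Z[::2, ::2]

print(Z.strides, V.strides)  # (6, 2) (12, 4)

# Guess derived from shape and itemsize only, valid for contiguous arrays
naive = (V.shape[1] * V.itemsize, V.itemsize)
print(naive)                    # (4, 2), which does not match V.strides
print(V.flags['C_CONTIGUOUS'])  # False
```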

[Figure: item layout]

[Figure: flattened item layout]

[Figure: memory layout (C order, big endian)]


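Views and copies¶

Direct and indirect access¶

First, distinguish basic indexing from advanced indexing: basic indexing always returns a view, whereas advanced indexing always returns a copy.

Some functions always return a copy, such as flatten, while others return a view whenever possible, such as ravel.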
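The behavior of flatten and ravel can be checked with np.shares_memory, including the non-contiguous case where ravel has no choice but to copy. A minimal sketch:

```python
import numpy as np

Z = np.zeros((3, 4))

print(np.shares_memory(Z, Z.ravel()))    # True:  view of contiguous data
print(np.shares_memory(Z, Z.flatten()))  # False: flatten always copies

strided = Z[:, ::2]  # non-contiguous view of Z
print(np.shares_memory(Z, strided.ravel()))  # False: ravel had to copy
```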

Intermediate variables¶

Arithmetic expressions usually create intermediate arrays.

In [254]:
X = np.ones(10, dtype=int)  # np.int is deprecated; the builtin int is equivalent
Y = np.ones(10, dtype=int)
A = 2*X + 2*Y  # three arrays are created: 2*X, 2*Y and their sum
A
Out[254]:
array([4, 4, 4, 4, 4, 4, 4, 4, 4, 4])

Using functions instead of operators and specifying the output ndarray avoids creating these intermediates.

In [255]:
X = np.ones(10, dtype=int)  # np.int is deprecated; the builtin int is equivalent
Y = np.ones(10, dtype=int)
np.multiply(X, 2, out=X)
np.multiply(Y, 2, out=Y)
np.add(X, Y, out=X)  # in place, no intermediate array
X
Out[255]:
array([2, 2, 2, 2, 2, 2, 2, 2, 2, 2])
Out[255]:
array([2, 2, 2, 2, 2, 2, 2, 2, 2, 2])
Out[255]:
array([4, 4, 4, 4, 4, 4, 4, 4, 4, 4])
Out[255]:
array([4, 4, 4, 4, 4, 4, 4, 4, 4, 4])

Summary exercise¶

As a summary, consider this task: how do we determine whether one ndarray is a view of another? And if it is, what are its start, stop and step?

In [256]:
Z1 = np.arange(10)
Z2 = Z1[1:-1:2]


First, is Z2 a view of Z1? This is easy to settle: just check whether the base of Z2 is Z1.

In [257]:
Z2.base is Z1
Out[257]:
True

Now that Z2 is known to be a view of Z1, how do we determine start, stop and step? The step is fairly simple: use the strides attribute.

In [258]:
step = Z2.strides[0] // Z1.strides[0]  # note: integer division

step
Out[258]:
2

For start and stop, use the byte_bounds function, which returns pointers to the first byte of the ndarray and to the byte just past its last element.


In [259]:
np.byte_bounds(Z1)
Out[259]:
(2241121499232, 2241121499272)
In [260]:
offset_start = np.byte_bounds(Z2)[0] - np.byte_bounds(Z1)[0]
offset_start  # bytes


offset_stop = np.byte_bounds(Z2)[-1] - np.byte_bounds(Z1)[-1]
offset_stop  # bytes
Out[260]:
4
Out[260]:
-8

From these, start, stop and step can be computed

In [261]:
start = offset_start // Z1.itemsize
stop = Z1.size + offset_stop // Z1.itemsize

start, stop, step
Out[261]:
(1, 8, 2)

Verify that the result is correct

In [262]:
np.allclose(Z1[start:stop:step], Z2)
Out[262]:
True
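The steps above can be gathered into one helper. This is a sketch for the 1-D, positive-step case only; address and find_slice_1d are hypothetical names, and the start address is read from __array_interface__ instead of byte_bounds, which is an assumption made here to keep the example independent of where byte_bounds lives in a given NumPy version.

```python
import numpy as np

def address(a):
    # Memory address of the first element of a
    return a.__array_interface__['data'][0]

def find_slice_1d(base, view):
    """Recover (start, stop, step) such that base[start:stop:step] is view.

    Assumes both arrays are 1-D, view is a view of base, and step > 0.
    """
    step = view.strides[0] // base.strides[0]
    start = (address(view) - address(base)) // base.itemsize
    stop = start + step * (view.size - 1) + 1  # one past the last element used
    return start, stop, step

Z1 = np.arange(10)
Z2 = Z1[1:-1:2]
start, stop, step = find_slice_1d(Z1, Z2)
print(start, stop, step)  # 1 8 2
print(np.array_equal(Z1[start:stop:step], Z2))  # True
```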

Advanced exercise¶

The task above covers only 1-D ndarrays and ignores negative steps. Taking multidimensional ndarrays and negative steps into account:

In [263]:
Z1 = np.ones((5, 5))
Z2 = Z1[::2, ::2]
Z2
Out[263]:
array([[1., 1., 1.],
       [1., 1., 1.],
       [1., 1., 1.]])
In [264]:
itemsize = Z2.itemsize
offset_start = (np.byte_bounds(Z2)[0] - np.byte_bounds(Z1)[0])//itemsize
offset_stop = (np.byte_bounds(Z2)[-1] - np.byte_bounds(Z1)[-1]-1)//itemsize
index_start = np.unravel_index(offset_start, Z1.shape)
index_stop = np.unravel_index(Z1.size+offset_stop, Z1.shape)
index_step = np.array(Z2.strides)//np.array(Z1.strides)
index_start, index_stop, index_step
Out[264]:
((0, 0), (4, 4), array([2, 2], dtype=int32))
In [265]:
index = np.empty((Z1.ndim, 3)).astype(np.int16)

for i in range(len(index_step)):
    start = index_start[i]
    stop = index_stop[i]
    step = index_step[i]

    if step < 0:
        start, stop = stop, start - 1
    else:
        start, stop = start, stop + 1

    index[i] = start, stop, step


index
Out[265]:
array([[0, 5, 2],
       [0, 5, 2]], dtype=int16)


Published

1 2, 2019

Category

posts

Tags

  • NumPy 2
  • Python 16
