[Sparse Matrices] Using the torch.sparse Module

Tags: Python, Artificial Intelligence

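Sparse matrix formats

The mainstream sparse matrix formats best supported by the torch.sparse and scipy.sparse modules are currently COO, CSR, and CSC. These three formats also have the most APIs available.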

coo

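COO (coordinate format) stores the coordinates and the values of the non-zero elements of a matrix separately in three arrays: row indices, column indices, and values. The three arrays must have the same length n, where n is the number of non-zero elements.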
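As a rough illustration of the three-array layout described above, the following pure-Python sketch converts a small dense matrix into COO arrays. The helper name `dense_to_coo` is hypothetical and is not part of torch or scipy.

```python
def dense_to_coo(matrix):
    """Return (rows, cols, values) for the non-zero entries of a 2-D list."""
    rows, cols, values = [], [], []
    for r, row in enumerate(matrix):
        for c, value in enumerate(row):
            if value != 0:
                rows.append(r)       # row coordinate of this non-zero
                cols.append(c)       # column coordinate of this non-zero
                values.append(value)
    return rows, cols, values


rows, cols, values = dense_to_coo([[0, 0, 3],
                                   [4, 0, 5]])
print(rows, cols, values)  # [0, 1, 1] [2, 0, 2] [3, 4, 5]
```

The row and column arrays together are what the torch examples later in this article pass as `indices`.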

csr

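CSR (compressed sparse row) stores the matrix in three arrays: Index Pointers, Indices, and Data.

Index Pointers: element i is the position in Data where the non-zero values of row i start, and element i+1 is the position where they end (exclusive). The number of rows therefore equals len(Index Pointers) - 1, and row i contains Index Pointers[i+1] - Index Pointers[i] non-zero values.
Indices: element i is the column index of the i-th non-zero value.
Data: element i is the i-th non-zero value. Values are ordered strictly row by row, and by column within each row.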
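The three CSR arrays can be sketched in plain Python as follows. This is only a minimal illustration of the definitions above; the helper name `dense_to_csr` is an assumption and does not come from any library.

```python
def dense_to_csr(matrix):
    """Return (index_pointers, indices, data) for a 2-D list, row by row."""
    index_pointers, indices, data = [0], [], []
    for row in matrix:
        for c, value in enumerate(row):
            if value != 0:
                indices.append(c)   # column index of this non-zero
                data.append(value)
        # After finishing a row, record where the next row starts in data.
        index_pointers.append(len(data))
    return index_pointers, indices, data


index_pointers, indices, data = dense_to_csr([[0, 0, 3],
                                              [4, 0, 5],
                                              [0, 0, 0]])
print(index_pointers)  # [0, 1, 3, 3]
print(indices)         # [2, 0, 2]
print(data)            # [3, 4, 5]

n_rows = len(index_pointers) - 1
nnz_row_1 = index_pointers[2] - index_pointers[1]
print(n_rows, nnz_row_1)  # 3 2
```

The all-zero last row shows up only as a repeated pointer value, which is why the pointer array needs one more entry than there are rows.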

csc

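CSC (compressed sparse column) differs from CSR only in being column-major: the pointer array walks over columns and the index array stores row indices. All other rules are identical.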
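To make the column-major difference concrete, here is a small pure-Python sketch that builds CSC arrays. The helper name `dense_to_csc` is hypothetical, and the code assumes a rectangular list of lists.

```python
def dense_to_csc(matrix):
    """Return (col_pointers, row_indices, data) for a 2-D list, column by column."""
    n_cols = len(matrix[0]) if matrix else 0
    col_pointers, row_indices, data = [0], [], []
    for c in range(n_cols):
        for r, row in enumerate(matrix):
            if row[c] != 0:
                row_indices.append(r)  # row index of this non-zero
                data.append(row[c])
        # After finishing a column, record where the next column starts.
        col_pointers.append(len(data))
    return col_pointers, row_indices, data


col_pointers, row_indices, data = dense_to_csc([[0, 0, 3],
                                                [4, 0, 5]])
print(col_pointers)  # [0, 1, 1, 3]
print(row_indices)   # [1, 0, 1]
print(data)          # [4, 3, 5]
```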

Construction of Sparse COO tensors

Basic construction

>>> import torch
>>> i = [[0, 1, 1],
...      [2, 0, 2]]
>>> v = [3, 4, 5]
>>> s = torch.sparse_coo_tensor(i, v, (2, 3))
>>> s
tensor(indices=tensor([[0, 1, 1],
                       [2, 0, 2]]),
       values=tensor([3, 4, 5]),
       size=(2, 3), nnz=3, layout=torch.sparse_coo)
>>> s.to_dense()
tensor([[0, 0, 3],
        [4, 0, 5]])

In torch, the storage format of a tensor is recorded in tensor.layout, so checking tensor.layout == torch.sparse_coo tells you whether a tensor is a COO tensor. Dense tensors have layout torch.strided.
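The layout check can be tried directly. The sketch below reuses the tensor from the basic construction example; the variable names are only for illustration.

```python
import torch

sparse = torch.sparse_coo_tensor([[0, 1, 1], [2, 0, 2]], [3, 4, 5], (2, 3))
dense = sparse.to_dense()

# The layout attribute distinguishes sparse COO storage from dense storage.
print(sparse.layout == torch.sparse_coo)  # True
print(dense.layout == torch.strided)      # True
```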

Hybrid COO tensors (sparse and dense dimensions mixed)

>>> i = [[0, 1, 1],
...      [2, 0, 2]]
>>> v = [[3, 4], [5, 6], [7, 8]]
>>> s = torch.sparse_coo_tensor(i, v, (2, 3, 2))
>>> s
tensor(indices=tensor([[0, 1, 1],
                       [2, 0, 2]]),
       values=tensor([[3, 4],
                      [5, 6],
                      [7, 8]]),
       size=(2, 3, 2), nnz=3, layout=torch.sparse_coo)

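Unlike the basic construction, each element of values here can be a vector, which is the dense sub-tensor stored at the corresponding coordinate. The resulting COO tensor therefore has one extra dimension.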
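One way to see the extra dimension is to densify the hybrid tensor and index into it. This is a small sketch using the same indices and values as the example above.

```python
import torch

s = torch.sparse_coo_tensor([[0, 1, 1], [2, 0, 2]],
                            [[3, 4], [5, 6], [7, 8]],
                            (2, 3, 2))
dense = s.to_dense()

print(dense.shape)   # torch.Size([2, 3, 2])
print(dense[0, 2])   # tensor([3, 4]), the vector stored at coordinate (0, 2)
print(dense[0, 0])   # tensor([0, 0]), an unspecified coordinate is all zeros
```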

COO tensors with duplicate coordinates

>>> i = [[1, 1]]
>>> v = [3, 4]
>>> s = torch.sparse_coo_tensor(i, v, (3,))
>>> s
tensor(indices=tensor([[1, 1]]),
       values=tensor([3, 4]),
       size=(3,), nnz=2, layout=torch.sparse_coo)
>>> s.to_dense()
tensor([0, 7, 0])

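If the input coordinates contain duplicates, the COO tensor keeps every entry as given (note nnz=2 above) and is called uncoalesced; the values at a repeated coordinate are summed whenever the tensor is interpreted, for example by to_dense(). The method .coalesce() returns a tensor in which duplicate coordinates have been summed so that each coordinate appears only once, and .is_coalesced() reports whether a COO tensor is already in that form.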
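The effect of coalescing can be checked with the same duplicate-coordinate tensor. The sketch below only calls indices() and values() on the coalesced copy, since torch refuses those accessors on an uncoalesced tensor.

```python
import torch

s = torch.sparse_coo_tensor([[1, 1]], [3, 4], (3,))
print(s.is_coalesced())  # False: coordinate 1 is stored twice

c = s.coalesce()         # sums the values that share a coordinate
print(c.is_coalesced())  # True
print(c.indices())       # tensor([[1]])
print(c.values())        # tensor([7])
```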

Construction of CSR tensors

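Build the tensor from the three arrays defined earlier: Index Pointers, Indices, and Data, which torch calls crow_indices, col_indices, and values.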


>>> crow_indices = torch.tensor([0, 2, 4])
>>> col_indices = torch.tensor([0, 1, 0, 1])
>>> values = torch.tensor([1, 2, 3, 4])
>>> csr = torch.sparse_csr_tensor(crow_indices, col_indices, values, dtype=torch.float64)
>>> csr
tensor(crow_indices=tensor([0, 2, 4]),
       col_indices=tensor([0, 1, 0, 1]),
       values=tensor([1., 2., 3., 4.]), size=(2, 2), nnz=4,
       dtype=torch.float64)
>>> csr.to_dense()
tensor([[1., 2.],
        [3., 4.]], dtype=torch.float64)
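The row count and the per-row non-zero counts from the CSR definition can be read back from crow_indices. The following is a sketch on the same 2x2 tensor; `n_rows` and `nnz_per_row` are illustrative names.

```python
import torch

crow_indices = torch.tensor([0, 2, 4])
col_indices = torch.tensor([0, 1, 0, 1])
values = torch.tensor([1, 2, 3, 4])
csr = torch.sparse_csr_tensor(crow_indices, col_indices, values, dtype=torch.float64)

crow = csr.crow_indices()
n_rows = crow.numel() - 1                      # len(Index Pointers) - 1
nnz_per_row = (crow[1:] - crow[:-1]).tolist()  # Index Pointers[i+1] - Index Pointers[i]

print(csr.layout == torch.sparse_csr)  # True
print(n_rows, nnz_per_row)             # 2 [2, 2]
```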

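Linear Algebra operations (mixed sparse and dense operations)

In the table below, M denotes a 2-D tensor (matrix), V a 1-D tensor (vector), T a tensor, and f a scalar; * is element-wise multiplication and @ is matrix multiplication. M[SparseSemiStructured] is a semi-structured sparse matrix, which is not covered here; see the official torch documentation for details.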

| PyTorch operation | Sparse grad | Layout signature |
| -------------------- | ----------- | ------------------------------------------------------------------------- |
| torch.mv() | no | M[sparse_coo] @ V[strided] -> V[strided] |
| torch.mv() | no | M[sparse_csr] @ V[strided] -> V[strided] |
| torch.matmul() | no | M[sparse_coo] @ M[strided] -> M[strided] |
| torch.matmul() | no | M[sparse_csr] @ M[strided] -> M[strided] |
| torch.matmul() | no | M[SparseSemiStructured] @ M[strided] -> M[strided] |
| torch.matmul() | no | M[strided] @ M[SparseSemiStructured] -> M[strided] |
| torch.mm() | no | M[strided] @ M[SparseSemiStructured] -> M[strided] |
| torch.mm() | no | M[sparse_coo] @ M[strided] -> M[strided] |
| torch.mm() | no | M[SparseSemiStructured] @ M[strided] -> M[strided] |
| torch.sparse.mm() | yes | M[sparse_coo] @ M[strided] -> M[strided] |
| torch.smm() | no | M[sparse_coo] @ M[strided] -> M[sparse_coo] |
| torch.hspmm() | no | M[sparse_coo] @ M[strided] -> M[hybrid sparse_coo] |
| torch.bmm() | no | T[sparse_coo] @ T[strided] -> T[strided] |
| torch.addmm() | no | f * M[strided] + f * (M[sparse_coo] @ M[strided]) -> M[strided] |
| torch.addmm() | no | f * M[strided] + f * (M[SparseSemiStructured] @ M[strided]) -> M[strided] |
| torch.addmm() | no | f * M[strided] + f * (M[strided] @ M[SparseSemiStructured]) -> M[strided] |
| torch.sparse.addmm() | yes | f * M[strided] + f * (M[sparse_coo] @ M[strided]) -> M[strided] |
| torch.sspaddmm() | no | f * M[sparse_coo] + f * (M[sparse_coo] @ M[strided]) -> M[sparse_coo] |
| torch.lobpcg() | no | GENEIG(M[sparse_coo]) -> M[strided], M[strided] |
| torch.pca_lowrank() | yes | PCA(M[sparse_coo]) -> M[strided], M[strided], M[strided] |
| torch.svd_lowrank() | yes | SVD(M[sparse_coo]) -> M[strided], M[strided], M[strided] |

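For the entries above whose layout signature uses the @ or * operator, there is no need to remember the API name: the operator implicitly calls the corresponding API. For example: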


>>> a = torch.tensor([[0, 0, 1, 0], [1, 2, 0, 0], [0, 0, 0, 0]], dtype=torch.float64)
>>> sp = a.to_sparse_csr()
>>> vec = torch.randn(4, 1, dtype=torch.float64)
>>> sp.matmul(vec)
tensor([[ 0.4788],
        [-3.2338],
        [ 0.0000]], dtype=torch.float64)
>>> sp @ vec
tensor([[ 0.4788],
        [-3.2338],
        [ 0.0000]], dtype=torch.float64)

Note that multiplying a sparse tensor by a dense tensor with an operator always returns a dense tensor. To get a sparse result, call torch.smm() explicitly.
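The dense return type can be confirmed by inspecting the layout of the product. The sketch below uses a COO operand and a fixed vector instead of random data so that the result is reproducible; those choices are assumptions made for illustration.

```python
import torch

a = torch.tensor([[0, 0, 1, 0], [1, 2, 0, 0], [0, 0, 0, 0]], dtype=torch.float64)
vec = torch.arange(4, dtype=torch.float64).reshape(4, 1)

sp = a.to_sparse()   # sparse COO copy of a
out = sp @ vec       # sparse @ dense

print(out.layout == torch.strided)   # True: the product is a dense tensor
print(torch.allclose(out, a @ vec))  # True: same values as the dense product
```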

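torch also supports operations between two sparse tensors, but both inputs must use the same sparse layout, otherwise an error is raised. The returned sparse tensor uses the same layout as the inputs.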

Multiplication:

>>> a = torch.tensor([[0, 0, 1, 0], [1, 2, 0, 0], [0, 1, 0, 0], [1, 0, 0, 0]], dtype=torch.float64)
>>> b = torch.tensor([[0, 0, 2, 0], [3, 1, 0, 0], [0, 0, 4, 0], [1, 0, 0, 1]], dtype=torch.float64)
>>> sp1 = a.to_sparse_coo()
>>> sp2 = b.to_sparse_coo()
>>> sp1 @ sp2
tensor(indices=tensor([[0, 1, 1, 1, 2, 2, 3],
                       [2, 0, 1, 2, 0, 1, 2]]),
       values=tensor([4., 6., 2., 2., 3., 1., 2.]),
       size=(4, 4), nnz=7, dtype=torch.float64, layout=torch.sparse_coo)
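A quick way to sanity-check a sparse-sparse product is to compare it with the dense product of the same matrices. This is only a verification sketch on the matrices used above.

```python
import torch

a = torch.tensor([[0, 0, 1, 0], [1, 2, 0, 0], [0, 1, 0, 0], [1, 0, 0, 0]], dtype=torch.float64)
b = torch.tensor([[0, 0, 2, 0], [3, 1, 0, 0], [0, 0, 4, 0], [1, 0, 0, 1]], dtype=torch.float64)

product = a.to_sparse() @ b.to_sparse()  # COO @ COO

print(product.layout == torch.sparse_coo)         # True: the result stays sparse
print(torch.allclose(product.to_dense(), a @ b))  # True
```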

Addition:

>>> a = torch.tensor([[0, 0, 1, 0], [1, 2, 0, 0], [0, 1, 0, 0], [1, 0, 0, 0]], dtype=torch.float64)
>>> b = torch.tensor([[0, 0, 2, 0], [3, 1, 0, 0], [0, 0, 4, 0], [1, 0, 0, 1]], dtype=torch.float64)
>>> sp1 = a.to_sparse_coo()
>>> sp2 = b.to_sparse_coo()
>>> sp3 = b.to_sparse_csr()
>>> sp1 + sp2
tensor(indices=tensor([[0, 1, 1, 2, 2, 3, 3],
                       [2, 0, 1, 1, 2, 0, 3]]),
       values=tensor([3., 4., 3., 1., 4., 2., 1.]),
       size=(4, 4), nnz=7, dtype=torch.float64, layout=torch.sparse_coo)
>>> sp1 + sp3
UserWarning: Sparse CSR tensor support is in beta state. If you miss a functionality in the sparse tensor support, please submit a feature request to https://github.com/pytorch/pytorch/issues. (Triggered internally at C:\actions-runner\_work\pytorch\pytorch\builder\windows\pytorch\aten\src\ATen\SparseCsrTensorImpl.cpp:55.)
  sp3 = b.to_sparse_csr()
Traceback (most recent call last):
  File "C:\Users\Xu Han\Desktop\pycharm-projects\MD_notes\main.py", line 18, in <module>
    print(sp1 + sp3)
RuntimeError: memory format option is only supported by strided tensors
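One possible workaround for the mixed-layout failure above is to convert both operands to the same layout before adding. The sketch below assumes that the installed torch version allows to_sparse_coo() to be called on a CSR tensor; whether the mixed-layout addition itself raises may also depend on the version.

```python
import torch

a = torch.tensor([[0, 0, 1, 0], [1, 2, 0, 0], [0, 1, 0, 0], [1, 0, 0, 0]], dtype=torch.float64)
b = torch.tensor([[0, 0, 2, 0], [3, 1, 0, 0], [0, 0, 4, 0], [1, 0, 0, 1]], dtype=torch.float64)

sp1 = a.to_sparse_coo()
sp3 = b.to_sparse_csr()

# Bring the CSR operand to COO first so both sides share one layout.
total = sp1 + sp3.to_sparse_coo()

print(total.layout == torch.sparse_coo)         # True
print(torch.allclose(total.to_dense(), a + b))  # True
```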

Tensor methods and sparse (tensor members related to sparsity)

| PyTorch operation | Description |
| -------------------- | ------------------------------------------------------------------------- |
| Tensor.is_sparse | Is True if the Tensor uses sparse COO storage layout, False otherwise. |
| Tensor.is_sparse_csr | Is True if the Tensor uses sparse CSR storage layout, False otherwise. |
| Tensor.dense_dim | Return the number of dense dimensions in a sparse tensor self. |
| Tensor.sparse_dim | Return the number of sparse dimensions in a sparse tensor self. |

A short break from the table to explain what dense_dim and sparse_dim mean. Earlier we built a hybrid COO tensor:

>>> i = [[0, 1, 1],
...      [2, 0, 2]]
>>> v = [[3, 4], [5, 6], [7, 8]]
>>> s = torch.sparse_coo_tensor(i, v, (2, 3, 2))
>>> s
tensor(indices=tensor([[0, 1, 1],
                       [2, 0, 2]]),
       values=tensor([[3, 4],
                      [5, 6],
                      [7, 8]]),
       size=(2, 3, 2), nnz=3, layout=torch.sparse_coo)

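For this tensor, dense_dim is 1 and sparse_dim is 2.

In addition, for math operations between two sparse tensors, such as matrix multiplication, make sure the sparse tensors have sparse_dim equal to 2.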
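The two counts can be queried directly on the hybrid tensor, and they add up to the total number of dimensions. A minimal sketch:

```python
import torch

s = torch.sparse_coo_tensor([[0, 1, 1], [2, 0, 2]],
                            [[3, 4], [5, 6], [7, 8]],
                            (2, 3, 2))

print(s.sparse_dim())  # 2: the dimensions addressed by the indices
print(s.dense_dim())   # 1: the trailing dimension stored inside values
print(s.dim())         # 3
```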

The table continues.

| PyTorch operation | Description |
| -------------------- | ------------------------------------------------------------------------------------------------------------------------ |
| Tensor.sparse_mask | Returns a new sparse tensor with values from a strided tensor self filtered by the indices of the sparse tensor mask. |
| Tensor.to_sparse | Returns a sparse copy of the tensor. |
| Tensor.to_sparse_coo | Convert a tensor to coordinate format. |
| Tensor.to_sparse_csr | Convert a tensor to compressed row storage format (CSR). |
| Tensor.to_sparse_csc | Convert a tensor to compressed column storage (CSC) format. |
| Tensor.to_sparse_bsr | Convert a tensor to a block sparse row (BSR) storage format of given blocksize. |
| Tensor.to_sparse_bsc | Convert a tensor to a block sparse column (BSC) storage format of given blocksize. |
| Tensor.to_dense | Creates a strided copy of self if self is not a strided tensor, otherwise returns self. |
| Tensor.values | Return the values tensor of a sparse COO tensor. |
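As an illustration of Tensor.sparse_mask from the table, the sketch below keeps only the entries of a dense tensor that sit at the coordinates of a sparse mask. The mask values themselves are arbitrary placeholders; only its indices matter here.

```python
import torch

dense = torch.arange(6, dtype=torch.float32).reshape(2, 3)  # [[0, 1, 2], [3, 4, 5]]
mask = torch.sparse_coo_tensor([[0, 1], [2, 0]], [1.0, 1.0], (2, 3)).coalesce()

picked = dense.sparse_mask(mask)  # sparse tensor holding dense[0, 2] and dense[1, 0]

print(picked.layout == torch.sparse_coo)  # True
print(picked.to_dense())
# tensor([[0., 0., 2.],
#         [3., 0., 0.]])
```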

The following members are available only on COO tensors:

| PyTorch operation | Description |
| ------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------- |
| Tensor.coalesce | Returns a coalesced copy of self if self is an uncoalesced tensor. |
| Tensor.sparse_resize_ | Resizes self sparse tensor to the desired size and the number of sparse and dense dimensions. |
| Tensor.sparse_resize_and_clear_ | Removes all specified elements from a sparse tensor self and resizes self to the desired size and the number of sparse and dense dimensions. |
| Tensor.is_coalesced | Returns True if self is a sparse COO tensor that is coalesced, False otherwise. |
| Tensor.indices | Return the indices tensor of a sparse COO tensor. |

The following members are available only on CSR and BSR tensors:

| PyTorch operation | Description |
| ------------------- | --------------------------------------------------------------------------------------------------------------------------------------- |
| Tensor.crow_indices | Returns the tensor containing the compressed row indices of the self tensor when self is a sparse CSR tensor of layout sparse_csr. |
| Tensor.col_indices | Returns the tensor containing the column indices of the self tensor when self is a sparse CSR tensor of layout sparse_csr. |

The following members are available only on CSC and BSC tensors:

| PyTorch operation | Description |
| ------------------- | ------ |
| Tensor.row_indices | … |
| Tensor.ccol_indices | … |
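By analogy with the CSR accessors, the two CSC members can be tried on a small matrix. This sketch assumes a torch version that provides Tensor.to_sparse_csc(); the expected values follow from the column-major rule described at the start of the article.

```python
import torch

a = torch.tensor([[1.0, 0.0],
                  [2.0, 3.0]])
csc = a.to_sparse_csc()

print(csc.ccol_indices().tolist())  # [0, 2, 3]: column 0 holds 2 values, column 1 holds 1
print(csc.row_indices().tolist())   # [0, 1, 1]
print(csc.values().tolist())        # [1.0, 2.0, 3.0]
```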

Tensor methods available on COO tensors (in practice, some of them also work on CSR tensors, for example dim())

add() add_() addmm() addmm_() any() asin() asin_() arcsin() arcsin_() bmm() clone() deg2rad() deg2rad_() detach() detach_() dim() div() div_() floor_divide() floor_divide_() get_device() index_select() isnan() log1p() log1p_() mm() mul() mul_() mv() narrow_copy() neg() neg_() negative() negative_() numel() rad2deg() rad2deg_() resize_as_() size() pow() sqrt() square() smm() sspaddmm() sub() sub_() t() t_() transpose() transpose_() zero_()

Torch functions specific to sparse Tensors

| PyTorch operation | Description |
| ------------------------ | ------------------------------------------------------------------------------------------------------------------------------------------------------------ |
| sparse_coo_tensor | Constructs a sparse tensor in COO(rdinate) format with specified values at the given indices. |
| sparse_csr_tensor | Constructs a sparse tensor in CSR (Compressed Sparse Row) with specified values at the given crow_indices and col_indices. |
| sparse_csc_tensor | Constructs a sparse tensor in CSC (Compressed Sparse Column) with specified values at the given ccol_indices and row_indices. |
| sparse_bsr_tensor | Constructs a sparse tensor in BSR (Block Compressed Sparse Row) with specified 2-dimensional blocks at the given crow_indices and col_indices. |
| sparse_bsc_tensor | Constructs a sparse tensor in BSC (Block Compressed Sparse Column) with specified 2-dimensional blocks at the given ccol_indices and row_indices. |
| sparse_compressed_tensor | Constructs a sparse tensor in Compressed Sparse format - CSR, CSC, BSR, or BSC - with specified values at the given compressed_indices and plain_indices. |
| sparse.sum | Return the sum of each row of the given sparse tensor. |
| sparse.addmm | This function does exact same thing as torch.addmm() in the forward, except that it supports backward for sparse COO matrix mat1. |
| sparse.sampled_addmm | Performs a matrix multiplication of the dense matrices mat1 and mat2 at the locations specified by the sparsity pattern of input. |
| sparse.mm | Performs a matrix multiplication of the sparse matrix mat1 and the (sparse or strided) matrix mat2. |
| sspaddmm | Matrix multiplies a sparse tensor mat1 with a dense tensor mat2, then adds the sparse tensor input to the result. |
| hspmm | Performs a matrix multiplication of a sparse COO matrix mat1 and a strided matrix mat2. |
| smm | Performs a matrix multiplication of the sparse matrix input with the dense matrix mat. |
| sparse.softmax | Applies a softmax function. |
| sparse.log_softmax | Applies a softmax function followed by logarithm. |
| sparse.spdiags | Creates a sparse 2D tensor by placing the values from rows of diagonals along specified diagonals of the output. |
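As one example from this table, torch.sparse.sum can reduce over everything or over a chosen dimension. The sketch below uses a small float COO tensor; the variable names are illustrative.

```python
import torch

s = torch.tensor([[0.0, 0.0, 3.0],
                  [4.0, 0.0, 5.0]]).to_sparse()

total = torch.sparse.sum(s)            # sum over all specified elements
row_sums = torch.sparse.sum(s, dim=1)  # sum along the column dimension, one value per row

print(total.item())         # 12.0
print(row_sums.to_dense())  # tensor([3., 9.])
```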

Regular torch functions that support sparse tensors

cat() dstack() empty() empty_like() hstack() index_select() is_complex() is_floating_point() is_nonzero() is_same_size() is_signed() is_tensor() lobpcg() mm() native_norm() pca_lowrank() select() stack() svd_lowrank() unsqueeze() vstack() zeros() zeros_like()
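For instance, cat() from the list above can stack two COO tensors along the row dimension and keeps the result sparse. A small sketch with made-up matrices:

```python
import torch

top = torch.tensor([[0.0, 1.0],
                    [2.0, 0.0]]).to_sparse()
bottom = torch.tensor([[3.0, 0.0]]).to_sparse()

stacked = torch.cat([top, bottom], dim=0)  # concatenate along rows

print(stacked.layout == torch.sparse_coo)  # True
print(stacked.to_dense())
# tensor([[0., 1.],
#         [2., 0.],
#         [3., 0.]])
```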

Unary functions that support sparse tensors

The following operators currently support sparse COO/CSR/CSC/BSR tensor inputs.

abs() asin() asinh() atan() atanh() ceil() conj_physical() floor() log1p() neg() round() sin() sinh() sign() sgn() signbit() tan() tanh() trunc() expm1() sqrt() angle() isinf() isposinf() isneginf() isnan() erf() erfinv()
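These functions all map zero to zero, so they can be applied to the stored values while the result stays sparse. The sketch below tries sqrt() on a small COO tensor obtained from to_sparse(), which is already coalesced.

```python
import torch

dense = torch.tensor([[0.0, 4.0],
                      [9.0, 0.0]])
sp = dense.to_sparse()

root = torch.sqrt(sp)  # applied to the stored values only

print(root.layout == torch.sparse_coo)  # True
print(root.to_dense())
# tensor([[0., 2.],
#         [3., 0.]])
```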
