Yes, this is possible. The idea is to use np.repeat to build an array in which each row is repeated a variable number of times. Here is the code:
# The two following lines can be done only once if the indices are constant between iterations (precomputation)
counts = np.array([len(e) for e in vertex_neighbors])
flatten_indices = np.concatenate(vertex_neighbors)
E_int = np.sum(np.repeat(new_vertices, counts, axis=0) - new_vertices[flatten_indices])
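To make the alignment concrete, here is a minimal sketch on toy data. The three 2D vertices and the ragged neighbor list are hypothetical values chosen for illustration; only the names `new_vertices`, `vertex_neighbors`, `counts` and `flatten_indices` come from the snippet above.

```python
import numpy as np

# Hypothetical toy data: 3 vertices in 2D and a ragged neighbor list.
new_vertices = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]])
vertex_neighbors = [np.array([1, 2]), np.array([0]), np.array([0, 1])]

counts = np.array([len(e) for e in vertex_neighbors])  # neighbors per vertex
flatten_indices = np.concatenate(vertex_neighbors)     # all neighbor indices, in order

# Row j of new_vertices is repeated counts[j] times, so every repeated row
# lines up with exactly one neighbor of vertex j in flatten_indices.
repeated = np.repeat(new_vertices, counts, axis=0)
E_int = np.sum(repeated - new_vertices[flatten_indices])

# Reference: the explicit per-vertex loop gives the same total.
E_loop = sum(
    np.sum(new_vertices[j] - new_vertices[nb])
    for j, nb in enumerate(vertex_neighbors)
)
print(E_int, E_loop)
```

Both arrays in the subtraction have one row per (vertex, neighbor) pair, which is why a single vectorized difference can replace the loop.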
Here is a benchmark:
import numpy as np
from time import time

n = 32768
vertices = np.random.rand(n, 3)
indices = []
count = np.random.randint(1, 10, size=n)

# Build a ragged neighbor list: between 1 and 9 random neighbors per vertex
for i in range(n):
    indices.append(np.random.randint(0, n, size=count[i]))

def initial_version(vertices, vertex_neighbors):
    sommes = []
    for j in range(vertices.shape[0]):
        terme = vertices[j] - vertices[vertex_neighbors[j]]
        somme_j = np.sum(terme)
        sommes.append(somme_j)
    return np.sum(sommes)

def optimized_version(vertices, vertex_neighbors):
    # The two following lines can be precomputed
    counts = np.array([len(e) for e in vertex_neighbors])
    flatten_indices = np.concatenate(vertex_neighbors)
    return np.sum(np.repeat(vertices, counts, axis=0) - vertices[flatten_indices])

def more_optimized_version(vertices, vertex_neighbors, counts, flatten_indices):
    return np.sum(np.repeat(vertices, counts, axis=0) - vertices[flatten_indices])

timesteps = 20

a = time()
for t in range(timesteps):
    res = initial_version(vertices, indices)
b = time()
print("V1: time:", b - a)
print("V1: result", res)

a = time()
for t in range(timesteps):
    res = optimized_version(vertices, indices)
b = time()
print("V2: time:", b - a)
print("V2: result", res)

a = time()
# Precompute the counts and flattened indices once, outside the loop
counts = np.array([len(e) for e in indices])
flatten_indices = np.concatenate(indices)
for t in range(timesteps):
    res = more_optimized_version(vertices, indices, counts, flatten_indices)
b = time()
print("V3: time:", b - a)
print("V3: result", res)
Here are the benchmark results on my machine:
V1: time: 3.656714916229248
V1: result -395.8416223057596
V2: time: 0.19800186157226562
V2: result -395.8416223057595
V3: time: 0.07983255386352539
V3: result -395.8416223057595
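As you can see, the optimized version (V2) is about 18 times faster than the reference implementation, and the version with precomputed indices (V3) is about 46 times faster.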
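Note that the optimized version should require more RAM, since np.repeat materializes one row per (vertex, neighbor) pair. This matters most when the number of neighbors per vertex is large.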
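If memory does become a problem, one possible variant is sketched below. It is not part of the benchmark above and assumes that only the total sum is needed: because the sum is linear, vertex j contributes its coordinate sum counts[j] times with a plus sign and once with a minus sign for every time it appears as a neighbor. The function name `low_memory_version` and the random test data are hypothetical.

```python
import numpy as np

def low_memory_version(vertices, counts, flatten_indices):
    # Hypothetical helper, not from the benchmark above.
    row_sums = vertices.sum(axis=1)  # sum of coordinates for each vertex
    # Vertex j appears counts[j] times on the left side of the difference.
    left = np.dot(counts, row_sums)
    # np.bincount gives how often each vertex occurs as a neighbor.
    occurrences = np.bincount(flatten_indices, minlength=vertices.shape[0])
    right = np.dot(occurrences, row_sums)
    return left - right

# Small check against the np.repeat formulation on random data.
rng = np.random.default_rng(0)
n = 1000
vertices = rng.random((n, 3))
neighbors = [rng.integers(0, n, size=c) for c in rng.integers(1, 10, size=n)]
counts = np.array([len(e) for e in neighbors])
flatten_indices = np.concatenate(neighbors)

reference = np.sum(np.repeat(vertices, counts, axis=0) - vertices[flatten_indices])
result = low_memory_version(vertices, counts, flatten_indices)
print(reference, result)
```

This only allocates arrays of length n, but it no longer works if per-pair differences are needed (for example, if a norm is applied before summing), so treat it as a special case rather than a drop-in replacement.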