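2 Answers
Contributed 1891 experience points · received more than 3 likes
This is an interesting question, so I decided to dig into your main concern.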
# required modules: line_profiler, matplotlib, seaborn and scipy
import time as dt
from line_profiler import LineProfiler
import matplotlib.pyplot as plt
import seaborn as sns
from scipy import stats

success = True
can_test = True

def and_op():
    for x in range(2000):
        s = success and can_test

def or_op():
    for x in range(2000):
        s = success or can_test

# Profile or_op 1000 times; each sample is the share of the total time
# spent on the last profiled line (the "or" expression itself)
or_op_list = []
for x in range(0,1000):
    lp = LineProfiler()
    lp_wrapper = lp(or_op)
    lp_wrapper()
    lstats = lp.get_stats()
    total_time = 0
    for v in lstats.timings.values():
        for op in v:
            total_time += op[-1]
            final = op[-1]
    operator = final/total_time
    or_op_list.append(operator)

# Same measurement for and_op
and_op_list = []
for x in range(0,1000):
    lp = LineProfiler()
    lp_wrapper = lp(and_op)
    lp_wrapper()
    lstats = lp.get_stats()
    total_time = 0
    for v in lstats.timings.values():
        for op in v:
            total_time += op[-1]
            final = op[-1]
    operator = final/total_time
    and_op_list.append(operator)

sns.kdeplot(and_op_list, label = 'AND')
sns.kdeplot(or_op_list, label = 'OR')
plt.legend()  # make sure the AND/OR labels are shown
plt.show()
print(stats.ttest_ind(and_op_list,or_op_list, equal_var = False))
pvalue=1.8293386245013954e-103
In fact, the difference between the "or" and the "and" operation is statistically significant.
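For readers wondering what equal_var = False selects: it requests Welch's t-test, which does not assume that the two samples share the same variance. The statistic itself can be sketched with the standard library alone. The helper name welch_t is hypothetical, and the sample values are made up for illustration; they are not taken from the profiling run.

```python
import math
import statistics

def welch_t(a, b):
    """Welch's t statistic for two independent samples (unequal variances)."""
    mean_a, mean_b = statistics.mean(a), statistics.mean(b)
    # statistics.variance is the sample variance (divides by n - 1)
    var_a, var_b = statistics.variance(a), statistics.variance(b)
    return (mean_a - mean_b) / math.sqrt(var_a / len(a) + var_b / len(b))

# Made-up illustrative data, not measurements from this answer
sample_a = [1.0, 2.0, 3.0, 4.0, 5.0]
sample_b = [2.0, 4.0, 6.0, 8.0, 10.0]
t_stat = welch_t(sample_a, sample_b)
print(t_stat)
```

This only computes the statistic; turning it into a p-value also needs the Welch-Satterthwaite degrees of freedom, which scipy handles internally.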
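Contributed 1856 experience points · received more than 5 likes
When I run your code on my machine, it also sometimes prints that True and True is faster than True or True.
The reason is that dt.time() in your code measures time at roughly microsecond (i.e. 1000-nanosecond) granularity, and that scale is too coarse to measure the time taken by each execution of "if success and can_test:" or "if success or can_test:". In most cases, a single execution of either statement takes less than 1 microsecond.
So, in the following parts of your code: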
for i in range(10000000):
    start = dt.time()
    if success and can_test:  # a dust particle
        stop = dt.time()
        time += stop - start  # measured by a normal scale ruler

for i in range(10000000):
    start = dt.time()
    if success or can_test:  # a dust particle
        stop = dt.time()
        time += stop - start  # measured by a normal scale ruler
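What your code does is like measuring each dust particle with a normal scale ruler and adding up the measured values. Because the measurement error is huge compared with the thing being measured, the result is distorted.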
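How coarse each clock is can be checked on any given machine with time.get_clock_info from the standard library. The following is a minimal sketch; smallest_tick is a hypothetical helper, and everything it reports is platform-dependent, so no particular output is implied.

```python
import time

# time.get_clock_info reports platform-dependent properties of each clock
wall = time.get_clock_info("time")          # clock behind time.time()
perf = time.get_clock_info("perf_counter")  # clock behind time.perf_counter()

for name, info in (("time", wall), ("perf_counter", perf)):
    print(f"{name}: resolution={info.resolution} s, monotonic={info.monotonic}")

def smallest_tick(clock, attempts=100_000):
    """Smallest nonzero difference seen between two consecutive readings."""
    best = None
    for _ in range(attempts):
        first = clock()
        second = clock()
        delta = second - first
        if delta > 0 and (best is None or delta < best):
            best = delta
    return best  # None if the clock never advanced between two readings

print("observed time.time_ns tick:", smallest_tick(time.time_ns), "ns")
```

If the observed tick is much larger than the operation being timed, per-iteration measurements with that clock are mostly rounding error.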
To investigate further, we can run the code below (d records each measured duration and how often it occurred):
import time as dt
from pprint import pprint

success = True
can_test = True

time = 0
d = {}
for i in range(10000000):
    start = dt.time_ns()
    if success and can_test:  # a dust particle
        stop = dt.time_ns()
        diff_time = stop - start  # measurement by a normal scale ruler
        d[diff_time] = d.get(diff_time, 0) + 1
        time += diff_time
print(f'"and" operation took: {time} ns')
print('"and" operation time distribution:')
pprint(d)
print()

time = 0
d = {}
for i in range(10000000):
    start = dt.time_ns()
    if success or can_test:  # a dust particle
        stop = dt.time_ns()
        diff_time = stop - start  # measurement by a normal scale ruler
        d[diff_time] = d.get(diff_time, 0) + 1
        time += diff_time
print(f'"or" operation took: {time} ns')
print('"or" operation time distribution:')
pprint(d)
It prints something like the following:
"and" operation took: 1467442000 ns
"and" operation time distribution:
{0: 8565832,
1000: 1432066,
2000: 136,
3000: 24,
4000: 12,
5000: 15,
6000: 10,
7000: 12,
8000: 6,
9000: 7,
10000: 6,
11000: 3,
12000: 191,
13000: 722,
14000: 170,
15000: 462,
16000: 23,
17000: 30,
18000: 27,
19000: 10,
20000: 12,
21000: 11,
22000: 61,
23000: 65,
24000: 9,
25000: 2,
26000: 2,
27000: 3,
28000: 1,
29000: 4,
30000: 4,
31000: 2,
32000: 2,
33000: 2,
34000: 3,
35000: 3,
36000: 5,
37000: 4,
40000: 2,
41000: 1,
42000: 2,
43000: 2,
44000: 2,
48000: 2,
50000: 3,
51000: 3,
52000: 1,
53000: 3,
54000: 1,
55000: 4,
58000: 1,
59000: 2,
61000: 1,
62000: 4,
63000: 1,
84000: 1,
98000: 1,
1035000: 1,
1043000: 1,
1608000: 1,
1642000: 1}
"or" operation took: 1455555000 ns
"or" operation time distribution:
{0: 8569860,
1000: 1428228,
2000: 131,
3000: 31,
4000: 22,
5000: 8,
6000: 8,
7000: 6,
8000: 3,
9000: 6,
10000: 3,
11000: 4,
12000: 173,
13000: 623,
14000: 174,
15000: 446,
16000: 28,
17000: 22,
18000: 31,
19000: 9,
20000: 11,
21000: 8,
22000: 42,
23000: 72,
24000: 7,
25000: 3,
26000: 1,
27000: 5,
28000: 2,
29000: 2,
31000: 1,
33000: 1,
34000: 2,
35000: 4,
36000: 1,
37000: 1,
38000: 2,
41000: 1,
44000: 1,
45000: 2,
46000: 2,
47000: 2,
48000: 2,
49000: 1,
50000: 1,
51000: 2,
53000: 1,
61000: 1,
64000: 1,
65000: 1,
942000: 1}
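We can see that about 85.7% of the attempts to measure the time (8565832 / 10000000 = 0.8565832 and 8569860 / 10000000 = 0.8569860) failed, because they measured 0 nanoseconds. About 14.3% of the attempts (1432066 / 10000000 = 0.1432066 and 1428228 / 10000000 = 0.1428228) measured 1000 nanoseconds. Needless to say, the remaining attempts (less than 0.1%) also landed on multiples of 1000 nanoseconds. The microsecond scale is simply too coarse to measure the time taken by each execution.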
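The percentages quoted above can be re-derived from the two histograms. The sketch below copies only the 0 ns and 1000 ns counts from the printed output and assumes every one of the 10000000 iterations was recorded; the helper name shares is a hypothetical choice for illustration.

```python
total = 10_000_000
and_counts = {0: 8_565_832, 1000: 1_432_066}  # from the "and" histogram
or_counts = {0: 8_569_860, 1000: 1_428_228}   # from the "or" histogram

def shares(counts, total):
    """Percentages measured as 0 ns, as 1000 ns, and as anything else."""
    zero = 100 * counts[0] / total
    one_tick = 100 * counts[1000] / total
    rest = 100 * (total - counts[0] - counts[1000]) / total
    return zero, one_tick, rest

for label, counts in (("and", and_counts), ("or", or_counts)):
    zero, one_tick, rest = shares(counts, total)
    print(f"{label}: 0 ns {zero:.1f}%, 1000 ns {one_tick:.1f}%, other {rest:.3f}%")
```

Both loops end up with nearly identical distributions, which is what one would expect if the clock granularity, rather than the operator, dominates each individual reading.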
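We can still use the normal scale ruler, though: gather the dust particles together and measure the whole dustball with it. So we can try the code below: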
import time as dt

success = True
can_test = True

start = dt.time()
for i in range(10000000):  # getting together the dust particles
    if success and can_test:  # a dust particle
        pass
stop = dt.time()
time = stop - start  # measure the size of the dustball
print(f'"and" operation took: {time} seconds')

start = dt.time()
for i in range(10000000):  # getting together the dust particles
    if success or can_test:  # a dust particle
        pass
stop = dt.time()
time = stop - start  # measure the size of the dustball
print(f'"or" operation took: {time} seconds')
It prints something like the following:
"and" operation took: 0.6261420249938965 seconds
"or" operation took: 0.48876094818115234 seconds
Alternatively, we can use a fine-scale ruler, dt.perf_counter(), which is fine enough to measure the size of each individual dust particle, as follows:
import time as dt

success = True
can_test = True

time = 0
for i in range(10000000):
    start = dt.perf_counter()
    if success and can_test:  # a dust particle
        stop = dt.perf_counter()
        time += stop - start  # measured by a fine-scale ruler
print(f'"and" operation took: {time} seconds')

time = 0
for i in range(10000000):
    start = dt.perf_counter()
    if success or can_test:  # a dust particle
        stop = dt.perf_counter()
        time += stop - start  # measured by a fine-scale ruler
print(f'"or" operation took: {time} seconds')
It prints something like the following:
"and" operation took: 1.6929048989996773 seconds
"or" operation took: 1.3965214280016083 seconds
So, as expected, True or True is faster than True and True!
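The batched "dustball" measurement used earlier in this answer is close to what the standard library's timeit module automates: it runs a statement many times and times the whole batch. A minimal sketch follows, assuming the same two flags as in the code above; the helper name time_expression and the repeat counts are illustrative choices, not part of the original answer.

```python
import timeit

success = True
can_test = True

def time_expression(expr, number=1_000_000, repeat=5):
    """Best total time in seconds for `number` evaluations of `expr`."""
    timer = timeit.Timer(expr, globals=globals())
    # Take the minimum over several repeats: it is the least disturbed run
    return min(timer.repeat(repeat=repeat, number=number))

and_time = time_expression("success and can_test")
or_time = time_expression("success or can_test")
print(f'"and": {and_time:.4f} s per 1,000,000 evaluations')
print(f'"or":  {or_time:.4f} s per 1,000,000 evaluations')
```

The absolute numbers depend on the machine and the Python version, so they are worth comparing only against each other within the same run.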