1 回答
TA贡献1827条经验 获得超4个赞
将您的示例输入input.txt如下
000000005AD76BD555C1D6D771DE417A4B87E4B4:4
00000000A8DAE4228F821FB418F59826079BF368:3
00000000DD7F2A1C68A35673713783CA390C9E93:630
00000001E225B908BAC31C56DB04D892E47536E0:5
00000006BAB7FC3113AA73DE3589630FC08218E7:2
00000008CD1806EB7B9B46A8F87690B2AC16F617:4
0000000A0E3B9F25FF41DE4B5AC238C2D545C7A8:15
0000000A1D4B746FAA3FD526FF6D5BC8052FDB38:16
0000000CAEF405439D57847A8657218C618160B2:15
0000000FC1C08E6454BED24F463EA2129E254D43:40
并删除计数,使您的文件变为(in.txt如下):
000000005AD76BD555C1D6D771DE417A4B87E4B4
00000000A8DAE4228F821FB418F59826079BF368
00000000DD7F2A1C68A35673713783CA390C9E93
00000001E225B908BAC31C56DB04D892E47536E0
00000006BAB7FC3113AA73DE3589630FC08218E7
00000008CD1806EB7B9B46A8F87690B2AC16F617
0000000A0E3B9F25FF41DE4B5AC238C2D545C7A8
0000000A1D4B746FAA3FD526FF6D5BC8052FDB38
0000000CAEF405439D57847A8657218C618160B2
0000000FC1C08E6454BED24F463EA2129E254D43
这将确保每个条目都有固定的大小。
现在您可以使用基于 mmap 的文件读取方法,如https://docs.python.org/3/library/mmap.html
import mmap
import os
FIELD_SIZE=40+1 # also include newline separator
def binarySearch(mm, l, r, x):
while l <= r:
mid = int(l + (r - l)/2);
# Check if x is present at mid
mid_slice = mm[mid*FIELD_SIZE:(mid+1)*FIELD_SIZE]
mid_slice = mid_slice.decode('utf-8').strip()
# print(mid_slice)
if mid_slice == x:
return mid
# If x is greater, ignore left half
elif mid_slice < x:
l = mid + 1
#print l
# If x is smaller, ignore right half
else:
r = mid - 1
#print r
# If we reach here, then the element
# was not present
return -1
# text=f.readlines()
# data=text
#print data
x = '0000000CAEF405439D57847A8657218C618160B2'
with open('in.txt', 'r+b') as f:
mm = mmap.mmap(f.fileno(), 0)
f.seek(0, os.SEEK_END)
size = f.tell()
result = binarySearch(mm, 0, size/FIELD_SIZE, x)
if result != -1:
print("Element is present at index % d" % result)
else:
print("Element is not present in array")
输出:
$ python3 find.py
Element is present at index 8
由于文件没有完全在内存中读取,因此不会出现内存不足的错误。
添加回答
举报