
How can I optimize Huffman decoding?

Java

红颜莎娜 2022-01-12 15:05:51

I've been working on Huffman decoding and I have a working function, but its time and space complexity are terrible. What I do so far is read each byte, extract each bit, and append it to currentBitString. I then reverse that string and append it to one huge string that essentially holds the bit data of the whole file. After that, I walk through the huge string looking for Huffman codes, and whenever one matches I write to the file. This code takes about 60 seconds to decode a 200 KB file, which is very bad, but I'm not sure how to improve it. I know that, for starters, I could write more than one byte at a time to the file, but that didn't seem to improve the time when I tried it.

    public static void decode(File f) throws Exception {
        BufferedInputStream fin = new BufferedInputStream(new FileInputStream(f));
        int i = f.getName().lastIndexOf('.');
        String extension = "txt";
        String newFileName = f.getName().substring(0, i) + extension;
        File nf = new File(newFileName);
        BufferedOutputStream fw = new BufferedOutputStream(new FileOutputStream(nf));
        int c;
        byte bits;
        byte current;
        String currentBitString = "";
        String bitString = "";

        // read each byte from file, reverse it, add to giant bitString
        // reads ALL BYTES
        while ((c = fin.read()) != -1) {
            current = (byte) c;
            currentBitString = "";
            bits = 0;
            for (int q = 0; q < 8; q++) {
                bits = getBit(current, q);
                currentBitString += bits;
            }
            StringBuilder bitStringReverse = new StringBuilder(currentBitString);
            bitString += bitStringReverse.reverse().toString();
        }

        currentBitString = "";
        boolean foundCode = false;
        for (int j = 0; j < bitString.length(); j++) {
            currentBitString += bitString.charAt(j);
            for (int k = 0; k < nodes.length; k++) {
                // nodes is an array of Huffman nodes which contains the byte
                // data and the Huffman codes for each byte
                if (nodes[k].code.compareTo(currentBitString.trim()) == 0) {
                    fw.write(nodes[k].data);
                    foundCode = true;
                    break;
                }
            }
            if (foundCode) {
                currentBitString = "";
                foundCode = false;
            }
        }
        fw.flush();
        fw.close();
        fin.close();
    }

This is the getBit function:

    public static byte getBit(byte ID, int position) {
        // return a certain bit in the selected byte
        return (byte) ((ID >> position) & 1);
    }


2 Answers

Smart猫小萌

Contributed 1911 experience points, received over 7 likes

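Don't read the whole thing into memory. Process the codes as you encounter them. Read enough bits to decode the next code, decode it, keep the unused bits for the following code, and repeat.
Don't use a string to represent bits, with each character standing for one bit. Use bits to represent bits. The shift, AND, and OR operators are what you should be using. You would have an integer as a bit buffer that holds all the bits needed to decode the next code.
Don't search over all code lengths and, inside that, linearly search all your codes to find a match! I'd be hard pressed to come up with a slower approach. You should decode with a tree descent or a table lookup. If you generate a canonical Huffman code in the first place, there is a simple lookup method you can implement. See puff.c for an example. The textbook approach (which is slower than what puff.c does) is to build the same Huffman tree on the receiving end, then go down the tree bit by bit until you reach a symbol. Emit the symbol and repeat.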
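The textbook tree descent described above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the asker's actual classes: the names HuffmanTreeDecoder, Node, and buildTree are hypothetical, the code table is assumed to be available as a map from byte value to a non-empty "0"/"1" string, bits are consumed most significant first (which matches the reversed per-byte strings in the question), and the decoder is assumed to know the number of symbols to emit so that padding bits in the last byte are not decoded.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.HashMap;
import java.util.Map;

final class HuffmanTreeDecoder {
    /** Tree node: a leaf carries a symbol, an inner node has child links. */
    static final class Node {
        Node zero, one;
        int symbol = -1; // -1 marks an inner node
    }

    /** Builds the decoding tree from a symbol -> code table (hypothetical format). */
    static Node buildTree(Map<Integer, String> codes) {
        Node root = new Node();
        for (Map.Entry<Integer, String> e : codes.entrySet()) {
            Node n = root;
            for (char bit : e.getValue().toCharArray()) {
                if (bit == '0') {
                    if (n.zero == null) n.zero = new Node();
                    n = n.zero;
                } else {
                    if (n.one == null) n.one = new Node();
                    n = n.one;
                }
            }
            n.symbol = e.getKey();
        }
        return root;
    }

    /** Streams the input byte by byte and emits exactly symbolCount symbols. */
    static void decode(InputStream in, OutputStream out, Node root, long symbolCount)
            throws IOException {
        Node n = root;
        long written = 0;
        int b;
        while (written < symbolCount && (b = in.read()) != -1) {
            // Walk one tree edge per bit, most significant bit first.
            for (int shift = 7; shift >= 0 && written < symbolCount; shift--) {
                n = ((b >> shift) & 1) == 0 ? n.zero : n.one;
                if (n == null) throw new IOException("invalid code in input");
                if (n.symbol >= 0) { // reached a leaf: emit and restart at the root
                    out.write(n.symbol);
                    written++;
                    n = root;
                }
            }
        }
        if (written < symbolCount) throw new IOException("unexpected end of input");
    }

    public static void main(String[] args) throws IOException {
        Map<Integer, String> codes = new HashMap<>();
        codes.put((int) 'a', "0");
        codes.put((int) 'b', "10");
        codes.put((int) 'c', "11");
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        // 0x58 = 0 10 11 0 followed by two padding bits
        decode(new ByteArrayInputStream(new byte[] {0x58}), out, buildTree(codes), 4);
        System.out.println(out); // prints "abca"
    }
}
```

Nothing here holds the whole file in memory or builds strings; each input bit costs one pointer step. In a real decoder the streams would be wrapped in BufferedInputStream and BufferedOutputStream, as in the question.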

You should be able to process 200K of compressed input in a few milliseconds on a single core of a modern processor.
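For the canonical-code lookup mentioned in connection with puff.c, a rough Java sketch of the same idea is given below. It assumes the encoder assigned canonical codes (shorter codes first, and within one length in increasing symbol order), that only the code length of each symbol is transmitted, that no code is longer than 15 bits, and that bits are consumed most significant first. The class name CanonicalHuffman and the symbolCount parameter are hypothetical.

```java
final class CanonicalHuffman {
    static final int MAX_BITS = 15;              // assumed maximum code length
    final int[] count = new int[MAX_BITS + 1];   // count[len] = number of codes of that length
    final int[] symbol;                          // symbols ordered by (length, symbol value)

    /** lengths[s] is the code length of symbol s; 0 means the symbol is unused. */
    CanonicalHuffman(int[] lengths) {
        for (int len : lengths) count[len]++;
        int[] offset = new int[MAX_BITS + 2];
        for (int len = 1; len <= MAX_BITS; len++) {
            offset[len + 1] = offset[len] + count[len];
        }
        symbol = new int[lengths.length];
        for (int s = 0; s < lengths.length; s++) {
            if (lengths[s] != 0) symbol[offset[lengths[s]]++] = s;
        }
    }

    /** Decodes symbolCount symbols from data, reading bits MSB first. */
    int[] decodeAll(byte[] data, int symbolCount) {
        int[] out = new int[symbolCount];
        int bitPos = 0;
        for (int i = 0; i < symbolCount; i++) {
            int code = 0;   // bits read so far for this symbol
            int first = 0;  // first canonical code of the current length
            int index = 0;  // index in symbol[] of the first code of the current length
            int len = 1;
            for (; len <= MAX_BITS; len++) {
                if (bitPos >= data.length * 8) {
                    throw new IllegalArgumentException("ran out of input");
                }
                code |= (data[bitPos >> 3] >> (7 - (bitPos & 7))) & 1;
                bitPos++;
                int c = count[len];
                if (code - c < first) { // the code falls within this length's range
                    out[i] = symbol[index + (code - first)];
                    break;
                }
                index += c;
                first += c;
                first <<= 1;
                code <<= 1;
            }
            if (len > MAX_BITS) throw new IllegalArgumentException("invalid code");
        }
        return out;
    }

    public static void main(String[] args) {
        // Symbol 0 has a 1-bit code ("0"); symbols 1 and 2 have 2-bit codes ("10", "11").
        CanonicalHuffman h = new CanonicalHuffman(new int[] {1, 2, 2});
        int[] decoded = h.decodeAll(new byte[] {0x58}, 4);
        System.out.println(java.util.Arrays.toString(decoded)); // prints [0, 1, 2, 0]
    }
}
```

The inner loop runs once per bit of the current code and does only integer arithmetic, with no tree and no per-code search.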

2022-01-12

神不在的星期二

Contributed 1963 experience points, received over 6 likes

You can replace the string concatenation (+=) with a StringBuilder. This allocates fewer objects and reduces the load on the garbage collector.

    int c;
    StringBuilder bitString = new StringBuilder();

    // read each byte from file, reverse it, add to giant bitString
    // reads ALL BYTES
    while ((c = fin.read()) != -1) {
        byte current = (byte) c;
        StringBuilder currentBitString = new StringBuilder();
        for (int q = 0; q < 8; q++) {
            byte bits = getBit(current, q);
            currentBitString.append(bits);
        }
        bitString.append(currentBitString.reverse());
    }

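Instead of keeping the codes and data in the nodes array, you should use a HashMap here. You currently compare codes by iterating over the whole array until you find the right match. On average that is n/2 calls to String#equals() per lookup. With a HashMap you reduce that to about 1.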

Populate your map with the data, using the codes as keys.

    Map<String, Integer> nodes = new HashMap<>();
    nodes.put(code, data);

Access the data from the map:

    StringBuilder currentBitString = new StringBuilder();
    for (int j = 0; j < bitString.length(); j++) {
        currentBitString.append(bitString.charAt(j));
        Integer data = nodes.get(currentBitString.toString());
        if (data != null) {
            // Code matched: emit its byte and start collecting the next code.
            fw.write(data);
            currentBitString.setLength(0);
        }
    }
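As a possible variant of the map lookup above, the key can be an int rather than a String, which avoids building a new string for every bit. The sketch below is an illustration only: the class MapDecoder, the key helper, and the symbolCount parameter are hypothetical names, codes are assumed to be at most 29 bits long, and bits are read most significant first. Each key is the code's bits with a leading 1 prepended, so that codes such as "0" and "00" map to different keys.

```java
import java.util.HashMap;
import java.util.Map;

final class MapDecoder {
    /** Packs a "0"/"1" code string into an int with a leading sentinel 1 bit. */
    static int key(String code) {
        int k = 1;
        for (char c : code.toCharArray()) {
            k = (k << 1) | (c - '0');
        }
        return k;
    }

    /** Decodes symbolCount bytes from data using a key -> byte value table. */
    static byte[] decode(byte[] data, Map<Integer, Integer> table, int symbolCount) {
        byte[] out = new byte[symbolCount];
        int written = 0;
        int k = 1; // sentinel only: no code bits collected yet
        for (int i = 0; i < data.length && written < symbolCount; i++) {
            for (int shift = 7; shift >= 0 && written < symbolCount; shift--) {
                k = (k << 1) | ((data[i] >> shift) & 1);
                Integer sym = table.get(k);
                if (sym != null) {
                    out[written++] = (byte) sym.intValue();
                    k = 1; // start collecting the next code
                } else if (k >= (1 << 30)) {
                    throw new IllegalArgumentException("no code matches the input bits");
                }
            }
        }
        if (written < symbolCount) {
            throw new IllegalArgumentException("input ended early");
        }
        return out;
    }

    public static void main(String[] args) {
        Map<Integer, Integer> table = new HashMap<>();
        table.put(key("0"), (int) 'a');
        table.put(key("10"), (int) 'b');
        table.put(key("11"), (int) 'c');
        // 0x58 = 0 10 11 0 followed by two padding bits
        byte[] decoded = decode(new byte[] {0x58}, table, 4);
        System.out.println(new String(decoded, java.nio.charset.StandardCharsets.US_ASCII)); // prints "abca"
    }
}
```

This still performs one hash lookup per input bit, so a tree descent or table-driven decoder is likely to be faster, but it removes the per-bit string allocation.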

2022-01-12
