首页猿问使用多核的前缀搜索算法

使用多核的前缀搜索算法

Java

holdtom 2021-09-12 20:08:45

我有一项任务是从单词中过滤一个列表（向量）作为前缀。该算法应该使用现代多核处理器。解决方案是使用许多线程来处理列表。// PrintWriter writer = new PrintWriter("C:\\DemoList.txt", "UTF-8");// // for(char i = 'A'; i<= 'Z'; i++) {// for(char j = 'A'; j<= 'Z'; j++) {// for(char n = 'A'; n<= 'Z'; n++) {// for(char m = 'A'; m<= 'Z'; m++) {// writer.println("" + i + j + n + m );// }// // }// }// } List<String> allLines = Files.readAllLines(Paths.get("C:\\", "DemoList.txt")); Collections.shuffle(allLines); String pattern = "AA"; List<String> result = new ArrayList<>(); int cores = Runtime.getRuntime().availableProcessors(); int threadsNum = allLines.size() / cores; long start_time = System.nanoTime(); for (String word : allLines) { if (word.startsWith(pattern)) result.add(word); } long end_time = System.nanoTime(); double difference = (end_time - start_time) / 1e6; System.out.println("Time difference in Milliseconds with Brute-Force: " + difference);//With Parallisim: long new_start_time = System.nanoTime(); List<String> filteredList = allLines.parallelStream().filter(s -> s.startsWith(pattern)) .collect(Collectors.toList()); long new_end_time = System.nanoTime(); double new_difference = (new_end_time - new_start_time) / 1e6; System.out.println("Time difference in Milliseconds with Stream from Java 8: " + new_difference); 结果：使用 Brute-Force 的时间差（以毫秒为单位）：33.033602 使用 Java 8 的 Stream 的时间差（以毫秒为单位）：65.017069每个线程都应该从列表中过滤一个范围。你有更好的主意吗？您认为我应该对原始列表进行排序而不是对其进行二分查找吗？我应该在二进制排序中也使用多线程，还是应该使用 Collections.sort？你将如何实施？

查看完整描述

2 回答

一只萌萌小番薯

TA贡献1795条经验获得超7个赞

从您的代码示例中，您的时间测量方法与微基准测试相近，对于它来说，仅仅测量一次执行的时间会产生误导。

您可以在以下 StackOverflow 帖子中看到详细讨论：How do I write a correct micro-benchmark in Java?

我编写了一个更准确的基准测试，以获得对示例代码的更精确测量。该代码已在具有多线程的 QuadCore i7 上运行（但它是一台笔记本电脑，由于电源和热量管理，结果可能会略微偏向于产生更多热量的多线程代码）。

@Benchmark

public void testSequentialFor(Blackhole bh, Words words) {

List<String> filtered = new ArrayList<>();

for (String word : words.toSort) {

if (word.startsWith(words.prefix)) {

filtered.add(word);

}

bh.consume(filtered);

}

@Benchmark

public void testParallelStream(Blackhole bh, Words words) {

bh.consume(words.toSort.parallelStream()

.filter(w -> w.startsWith(words.prefix))

.collect(Collectors.toList())

);

}

@Benchmark

public void testManualThreading(Blackhole bh, Words words) {

// This is quick and dirty, bit gives a decent baseline as to

// what a manually threaded partitionning can achieve.

List<Future<List<String>>> async = new ArrayList<>();

// this has to be optimized to avoid creating "almost empty" work units

int batchSize = words.size / ForkJoinPool.commonPool().getParallelism();

int numberOfDispatchedWords = 0;

while (numberOfDispatchedWords < words.toSort.size()) {

int start = numberOfDispatchedWords;

int end = numberOfDispatchedWords + batchSize;

async.add(words.threadPool.submit(() -> {

List<String> batch = words.toSort.subList(start, Math.min(end, words.toSort.size()));

List<String> result = new ArrayList<>();

for (String word : batch) {

if (word.startsWith(words.prefix)) {

result.add(word);

}

return result;

}));

numberOfDispatchedWords += batchSize;

}

List<String> result = new ArrayList<>();

for (Future<List<String>> asyncResult : async) {

try {

result.addAll(asyncResult.get());

} catch (Exception e) {

e.printStackTrace();

}

bh.consume(result);

}

@State(Scope.Benchmark)

public static class Words {

ExecutorService threadPool = ForkJoinPool.commonPool();

List<String> toSort;

@Param({"100", "1000", "10000", "100000"})

private int size;

private String prefix = "AA";

@Setup

public void prepare() {

//a 4 to 13 letters long, random word

//for more precision, it should not be that random (use a fixed seed), but given the simple nature of the fitlering, I guess it's ok this way

Supplier<String> wordCreator = () -> RandomStringUtils.random(4 + ThreadLocalRandom.current().nextInt(10));

toSort = Stream.generate(wordCreator).limit(size).collect(Collectors.toList());

}

这是结果

基准（大小）模式 Cnt 分数误差单位

PerfTest.testManualThreading 100 thrpt 6 95971,811 ± 1100,589 ops/s

PerfTest.testManualThreading 1000 thrpt 6 76293,983 ± 1632,959 ops/s

PerfTest.testManualThreading 10000 thrpt 6 34630,814 ± 2660,058 ops/s

PerfTest.testManualThreading 100000 thrpt 6 5956,552 ± 529,368 ops/s

PerfTest.testParallelStream 100 thrpt 6 69965,462 ± 451,418 ops/s

PerfTest.testParallelStream 1000 thrpt 6 59968,271 ± 774,859 ops/s

PerfTest.testParallelStream 10000 thrpt 6 29079,957 ± 513,244 ops/s

PerfTest.testParallelStream 100000 thrpt 6 4217,146 ± 172,781 ops/s

PerfTest.testSequentialFor 100 thrpt 6 3553930,640 ± 21142,150 ops/s

PerfTest.testSequentialFor 1000 thrpt 6 356217,937 ± 7446,137 ops/s

PerfTest.testSequentialFor 10000 thrpt 6 28894,748 ± 674,929 ops/s

PerfTest.testSequentialFor 100000 thrpt 6 1725,735 ± 13,273 ops/s

所以顺序版本看起来方式更快达几千元素，它们是在同水准与手动线程10K前位，并与并行流后位，和线程代码进行更好的从那里。

考虑到编写“手动线程变体”所需的代码量，以及在那里创建错误或通过计算批量大小而导致效率低下的风险，我可能不会选择该选项，即使它看起来可能比巨大列表的流。

我不会尝试先排序，然后二分搜索，因为过滤是一个 O(N) 操作，然后排序一个 O(Nlog(N))（在它上面你必须添加一个二分搜索）。因此，除非您对数据有非常精确的模式，否则我认为它不会对您有利。

请注意，尽管不要得出此基准测试无法支持的结论。一方面，它基于这样一个假设，即过滤是程序中唯一发生的事情并为 CPU 时间而战。如果您在任何类型的“多用户”应用程序（例如 Web 应用程序）中，那么这可能不是真的，您很可能会失去一切，尽管您可以通过多线程获得。

反对回复 2021-09-12

呼啦一阵风

TA贡献1802条经验获得超6个赞

从 Java 8 开始，您可以使用流来解决以下几行问题：

List<String> yourList = new ArrayList<>(); // A list whose size requires parallelism

String yourPrefix = ""; // Your custom prefix

final List<String> filteredList = yourList.parallelStream()

.filter(s -> s.startsWith(yourPrefix))

.collect(Collectors.toList());

我建议您阅读本文并查看此问题以了解并行性是否对您有所帮助。

反对回复 2021-09-12

2 回答
0 关注
193 浏览

关注

添加回答

0/150

提交

取消

热搜

最近搜索清空

使用多核的前缀搜索算法

使用多核的前缀搜索算法

2 回答

添加回答