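Introduction to Search Algorithms: Mastering the Basics

Tags: Algorithms and Data Structures

Overview

A search algorithm traverses or searches a data structure, and is widely used in pathfinding, game AI, web crawlers, and similar areas. This article introduces uninformed and informed search through breadth-first search, depth-first search, Dijkstra's algorithm, and A* search, with a Python implementation of each. It also covers optimization techniques and application scenarios, to give readers a rounded understanding of how search algorithms work and where they are used.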

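Introduction to Search Algorithms

Definition

A search algorithm traverses or searches a data structure (such as a tree or graph) to find a particular element or path. Search algorithms are used for problems such as pathfinding, graph problems, and data mining. Depending on the problem, the goal may be a shortest path from a start to a destination, any path at all, or simply a specific element in the structure.

Classification

Search algorithms fall into two broad categories: uninformed search and informed search.

Uninformed search: these algorithms use no information beyond the problem definition itself, that is, the graph structure and, where present, its edge costs. The main types are breadth-first search (BFS) and depth-first search (DFS). Dijkstra's algorithm, covered below, is also usually placed here (under the name uniform-cost search), because it uses edge weights but no estimate of the remaining distance to a goal.
Informed search: these algorithms rely on additional information, typically a heuristic function that estimates the remaining cost to the goal. They are usually applied to harder pathfinding problems. The main example in this article is A* search.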

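Application Scenarios

Search algorithms are used in many fields, including but not limited to:

Pathfinding: finding the shortest path from one node to another in a graph.
Maze problems: finding the shortest path from the entrance to the exit.
Game AI: for example, finding the best move in a board game.
Web crawlers: traversing a site's link structure to collect specific information.
Data mining: finding specific patterns or connections in network data.
Graph problems: for example, shortest paths and minimum spanning trees.

For example, route planning in a self-driving car uses a search algorithm to find the shortest route from the current position to the destination. In social network analysis, search algorithms can identify relationships between users and the networks through which influence spreads. In a web crawler, a search algorithm determines the order in which pages are visited, so that the site's structure can be mapped and its key information extracted.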

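Breadth-First Search

Breadth-first search (BFS) is an uninformed search algorithm that starts at a root node and traverses the graph level by level: it visits every node at distance k from the start before any node at distance k + 1. This is why BFS finds shortest paths in unweighted graphs, where path length is the number of edges.

Implementing Breadth-First Search

BFS keeps the nodes waiting to be visited in a queue. At each step it removes the node at the front of the queue, visits it, and appends its unvisited neighbors to the back.

from collections import deque

def bfs(graph, start):
    visited = set()
    queue = deque([start])
    visited.add(start)  # mark on enqueue so no node is queued twice

    while queue:
        vertex = queue.popleft()
        print("Visiting node:", vertex)
        for neighbor in graph[vertex]:
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append(neighbor)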
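
The `bfs` function above only prints the visiting order. To recover an actual shortest path, one common approach is to remember each node's predecessor and walk those links back from the goal. The following is a minimal sketch of that idea; the names `bfs_shortest_path` and `example_graph` are illustrative and do not come from the original article.

```python
from collections import deque

def bfs_shortest_path(graph, start, goal):
    """Return a path with the fewest edges from start to goal, or None."""
    parent = {start: None}  # doubles as the visited set
    queue = deque([start])
    while queue:
        vertex = queue.popleft()
        if vertex == goal:
            # Walk the parent links back to the start.
            path = []
            while vertex is not None:
                path.append(vertex)
                vertex = parent[vertex]
            return path[::-1]
        for neighbor in graph[vertex]:
            if neighbor not in parent:
                parent[neighbor] = vertex
                queue.append(neighbor)
    return None

# Illustrative directed graph stored as adjacency lists.
example_graph = {
    'A': ['B', 'C'],
    'B': ['D', 'E'],
    'C': ['F'],
    'D': [],
    'E': ['F'],
    'F': [],
}

print(bfs_shortest_path(example_graph, 'A', 'F'))  # ['A', 'C', 'F']
```

Storing one predecessor per node keeps memory linear in the number of nodes, whereas carrying a full path in every queue entry copies the path at each step.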

For example, BFS can find the shortest path from start to end in a maze. Assume the maze is a two-dimensional list in which 0 marks an open cell and 1 marks a wall.

from collections import deque

def bfs_maze(maze, start, end):
    rows, cols = len(maze), len(maze[0])
    visited = [[False] * cols for _ in range(rows)]
    visited[start[0]][start[1]] = True  # the start cell is already discovered
    queue = deque([(start, [start])])   # each entry: (cell, path taken to reach it)

    while queue:
        (current_row, current_col), path = queue.popleft()

        if (current_row, current_col) == end:
            return path

        # Try the four neighbors: up, down, left, right.
        for dr, dc in [(-1, 0), (1, 0), (0, -1), (0, 1)]:
            next_row, next_col = current_row + dr, current_col + dc

            if 0 <= next_row < rows and 0 <= next_col < cols and maze[next_row][next_col] == 0 and not visited[next_row][next_col]:
                visited[next_row][next_col] = True
                queue.append(((next_row, next_col), path + [(next_row, next_col)]))

    return None  # the end cell is unreachable

maze = [
    [0, 1, 0, 0, 0],
    [0, 1, 0, 1, 0],
    [0, 0, 0, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0]
]

start = (0, 0)
end = (4, 4)

path = bfs_maze(maze, start, end)
print("Shortest path from", start, "to", end, ":", path)

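Depth-First Search

Depth-first search (DFS) is also an uninformed search algorithm. Starting at a root node, it follows each branch as deep as it can, and backtracks when it reaches a node with no unvisited neighbors. DFS suits tasks that need to explore the whole graph, such as finding connected components or detecting whether a graph contains a cycle.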
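
As an illustration of the cycle-detection use case, the sketch below marks each node of a directed graph as unvisited, on the current DFS path, or finished; reaching a node that is still on the current path means a cycle exists. The function name `has_cycle` and both sample graphs are assumptions made for this example, and every neighbor is assumed to appear as a key in the graph dictionary.

```python
def has_cycle(graph):
    """Return True if the directed graph contains a cycle."""
    WHITE, GRAY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    color = {vertex: WHITE for vertex in graph}

    def visit(vertex):
        color[vertex] = GRAY
        for neighbor in graph[vertex]:
            if color[neighbor] == GRAY:  # edge back into the current path
                return True
            if color[neighbor] == WHITE and visit(neighbor):
                return True
        color[vertex] = BLACK
        return False

    # Start a DFS from every node that has not been reached yet.
    return any(color[v] == WHITE and visit(v) for v in graph)

acyclic = {'A': ['B', 'C'], 'B': ['D'], 'C': ['D'], 'D': []}
cyclic = {'A': ['B'], 'B': ['C'], 'C': ['A']}

print(has_cycle(acyclic))  # False
print(has_cycle(cyclic))   # True
```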

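Implementing Depth-First Search

DFS keeps the nodes waiting to be visited on a stack: at each step it takes the most recently added node and explores its neighbors. A recursive implementation is the most direct, because the call stack plays the role of the explicit stack.

def dfs(graph, start, visited=None):
    if visited is None:
        visited = set()
    visited.add(start)
    print("Visiting node:", start)
    for neighbor in graph[start]:
        if neighbor not in visited:
            # The recursive call marks the neighbor as visited itself.
            dfs(graph, neighbor, visited)

graph = {
    'A': ['B', 'C'],
    'B': ['D', 'E'],
    'C': ['F'],
    'D': [],
    'E': ['F'],
    'F': []
}

dfs(graph, 'A')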
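
The stack-based formulation described above can also be written without recursion, which avoids Python's recursion limit on deep graphs. This is a sketch under the same adjacency-list assumption; `dfs_iterative` and `example_graph` are names introduced here for illustration.

```python
def dfs_iterative(graph, start):
    """Return the vertices in the order an explicit-stack DFS visits them."""
    visited = set()
    order = []
    stack = [start]
    while stack:
        vertex = stack.pop()
        if vertex in visited:
            continue
        visited.add(vertex)
        order.append(vertex)
        # Push neighbors in reverse so the first neighbor is explored first,
        # which reproduces the visiting order of the recursive version.
        for neighbor in reversed(graph[vertex]):
            if neighbor not in visited:
                stack.append(neighbor)
    return order

example_graph = {
    'A': ['B', 'C'],
    'B': ['D', 'E'],
    'C': ['F'],
    'D': [],
    'E': ['F'],
    'F': [],
}

print(dfs_iterative(example_graph, 'A'))  # ['A', 'B', 'D', 'E', 'F', 'C']
```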

For example, DFS can find a path from start to end in a maze. It returns the first path it reaches, which is not necessarily the shortest one.

def dfs_maze(maze, start, end, visited=None):
    # `visited` holds the cells on the current path, in order.
    if visited is None:
        visited = []
    rows, cols = len(maze), len(maze[0])

    # Reject cells that are out of bounds, walls, or already on the path.
    if start[0] < 0 or start[0] >= rows or start[1] < 0 or start[1] >= cols or maze[start[0]][start[1]] == 1 or start in visited:
        return None

    visited.append(start)

    if start == end:
        return visited

    for dr, dc in [(-1, 0), (1, 0), (0, -1), (0, 1)]:
        new_start = (start[0] + dr, start[1] + dc)
        path = dfs_maze(maze, new_start, end, visited)

        if path:
            return path

    visited.pop()  # dead end: remove this cell from the path and backtrack
    return None

maze = [
    [0, 1, 0, 0, 0],
    [0, 1, 0, 1, 0],
    [0, 0, 0, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0]
]

start = (0, 0)
end = (4, 4)

path = dfs_maze(maze, start, end)
print("Path found from", start, "to", end, ":", path)
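
The maze function above stops at the first path it finds. Enumerating every simple path (one that never repeats a node) takes only a small change to the same backtracking idea. The generator below is a sketch on a small adjacency-list graph; `all_paths` and `example_graph` are illustrative names, and the number of simple paths can grow exponentially on larger graphs.

```python
def all_paths(graph, start, goal, path=None):
    """Yield every simple path from start to goal, depth first."""
    path = (path or []) + [start]  # copy, so sibling branches stay independent
    if start == goal:
        yield path
        return
    for neighbor in graph[start]:
        if neighbor not in path:  # never revisit a node on the current path
            yield from all_paths(graph, neighbor, goal, path)

example_graph = {
    'A': ['B', 'C'],
    'B': ['D', 'E'],
    'C': ['F'],
    'D': [],
    'E': ['F'],
    'F': [],
}

for found in all_paths(example_graph, 'A', 'F'):
    print(found)
# ['A', 'B', 'E', 'F']
# ['A', 'C', 'F']
```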

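Dijkstra's Algorithm

Dijkstra's algorithm finds the shortest paths from a start node to every other node in a graph whose edge weights are non-negative. It uses edge weights but no heuristic, so it is usually classified as uninformed search (uniform-cost search). A priority queue selects the next node to visit: always the unvisited node closest to the start.

Implementing Dijkstra's Algorithm

Below is a Python implementation of Dijkstra's algorithm. It returns the shortest distance from the start to every node in the graph.

import heapq

def dijkstra(graph, start):
    distances = {vertex: float('infinity') for vertex in graph}
    distances[start] = 0
    priority_queue = [(0, start)]

    while priority_queue:
        current_distance, current_vertex = heapq.heappop(priority_queue)

        # Skip stale entries: a shorter route to this vertex was already processed.
        if current_distance > distances[current_vertex]:
            continue

        for neighbor, weight in graph[current_vertex].items():
            distance = current_distance + weight

            # Relax the edge if it gives a shorter route to the neighbor.
            if distance < distances[neighbor]:
                distances[neighbor] = distance
                heapq.heappush(priority_queue, (distance, neighbor))

    return distances

graph = {
    'A': {'B': 1, 'C': 4},
    'B': {'A': 1, 'C': 2, 'D': 5},
    'C': {'A': 4, 'B': 2, 'D': 1},
    'D': {'B': 5, 'C': 1}
}

print(dijkstra(graph, 'A'))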
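
The function above reports distances only. If the route itself is needed, one option is to record the predecessor of each node whenever its distance improves, then follow those links back from the goal. The sketch below assumes the same dictionary-of-dictionaries graph format; `dijkstra_path`, `previous`, and `weighted_graph` are names chosen for this example.

```python
import heapq

def dijkstra_path(graph, start, goal):
    """Return (cost, path) for a cheapest path from start to goal."""
    distances = {start: 0}
    previous = {}
    heap = [(0, start)]
    while heap:
        dist, vertex = heapq.heappop(heap)
        if dist > distances.get(vertex, float('inf')):
            continue  # stale heap entry
        if vertex == goal:
            break     # the goal's distance is final once it is popped
        for neighbor, weight in graph[vertex].items():
            candidate = dist + weight
            if candidate < distances.get(neighbor, float('inf')):
                distances[neighbor] = candidate
                previous[neighbor] = vertex
                heapq.heappush(heap, (candidate, neighbor))
    if goal not in distances:
        return float('inf'), None  # goal unreachable
    path = [goal]
    while path[-1] != start:
        path.append(previous[path[-1]])
    return distances[goal], path[::-1]

weighted_graph = {
    'A': {'B': 1, 'C': 4},
    'B': {'A': 1, 'C': 2, 'D': 5},
    'C': {'A': 4, 'B': 2, 'D': 1},
    'D': {'B': 5, 'C': 1},
}

print(dijkstra_path(weighted_graph, 'A', 'D'))  # (4, ['A', 'B', 'C', 'D'])
```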

For example, in a maze Dijkstra's algorithm gives the shortest distance from the start to every reachable cell. Every step costs 1 here, so the distances match what BFS would find; the example simply shows the same machinery on a grid. Walls and unreachable cells keep a distance of infinity.

import heapq

def dijkstra_maze(maze, start):
    rows, cols = len(maze), len(maze[0])
    distances = {(row, col): float('infinity') for row in range(rows) for col in range(cols)}
    distances[start] = 0
    priority_queue = [(0, start)]

    while priority_queue:
        current_distance, (current_row, current_col) = heapq.heappop(priority_queue)

        # Skip stale entries: a shorter route to this cell was already processed.
        if current_distance > distances[(current_row, current_col)]:
            continue

        for dr, dc in [(-1, 0), (1, 0), (0, -1), (0, 1)]:
            next_row, next_col = current_row + dr, current_col + dc

            if 0 <= next_row < rows and 0 <= next_col < cols and maze[next_row][next_col] == 0:
                next_distance = current_distance + 1  # every step costs 1

                if next_distance < distances[(next_row, next_col)]:
                    distances[(next_row, next_col)] = next_distance
                    heapq.heappush(priority_queue, (next_distance, (next_row, next_col)))

    return distances

maze = [
    [0, 1, 0, 0, 0],
    [0, 1, 0, 1, 0],
    [0, 0, 0, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0]
]

start = (0, 0)
distances = dijkstra_maze(maze, start)
print("Shortest-path cost from the start to every cell:", distances)

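A* Search

A* search is an informed search algorithm that finds a shortest path while using a heuristic function (an estimate of the remaining distance to the goal) to guide the search. It combines the cost-so-far bookkeeping of Dijkstra's algorithm with heuristic guidance, which makes it well suited to path planning.

Implementing A* Search

A* uses a priority queue to choose the next node to visit, preferring the node with the smallest f = g + h, where g is the cost of the best known path from the start to the node and h is the heuristic estimate of the cost from the node to the goal. If h never overestimates the true remaining cost, the path A* returns is a shortest one.

import heapq

def heuristic(node, goal):
    # Manhattan distance as the heuristic
    return abs(node[0] - goal[0]) + abs(node[1] - goal[1])

def a_star_search(graph, start, goal):
    open_list = [(0, start)]  # entries are (f_cost, node)
    came_from = {}            # best known predecessor of each node
    g_cost = {start: 0}       # best known cost from the start to each node

    while open_list:
        current_cost, current_node = heapq.heappop(open_list)

        if current_node == goal:
            break

        for neighbor, weight in graph[current_node].items():
            tentative_g_cost = g_cost[current_node] + weight

            if tentative_g_cost < g_cost.get(neighbor, float('infinity')):
                came_from[neighbor] = current_node
                g_cost[neighbor] = tentative_g_cost
                f_cost = g_cost[neighbor] + heuristic(neighbor, goal)
                heapq.heappush(open_list, (f_cost, neighbor))

    return came_from, g_cost

graph = {
    (0, 0): {(0, 1): 1, (1, 0): 2},
    (0, 1): {(0, 0): 1, (0, 2): 1, (1, 1): 2},
    (0, 2): {(0, 1): 1, (1, 2): 2},
    (1, 0): {(0, 0): 2, (1, 1): 1},
    (1, 1): {(1, 0): 1, (1, 2): 1, (0, 1): 2},
    (1, 2): {(1, 1): 1, (0, 2): 2}
}

start = (0, 0)
goal = (1, 2)

came_from, g_cost = a_star_search(graph, start, goal)

# Follow the came_from links back from the goal. The start has no entry,
# so the loop ends there with the start already appended to the path.
current = goal
path = [current]
while current in came_from:
    current = came_from[current]
    path.append(current)
path.reverse()

print("Shortest path:", path)
print("Shortest path cost:", g_cost[goal])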

For example, A* search can find the shortest path from start to goal in a maze.

import heapq

def heuristic(node, goal):
    # Manhattan distance never overestimates on a 4-connected grid with unit steps
    return abs(node[0] - goal[0]) + abs(node[1] - goal[1])

def a_star_maze(maze, start, goal):
    rows, cols = len(maze), len(maze[0])
    came_from = {}
    g_cost = {start: 0}
    priority_queue = [(0, start)]
    best_cost = float('infinity')  # stays infinite if the goal is unreachable

    while priority_queue:
        current_cost, (current_row, current_col) = heapq.heappop(priority_queue)

        if (current_row, current_col) == goal:
            best_cost = current_cost  # h(goal) is 0, so f equals the path cost
            break

        for dr, dc in [(-1, 0), (1, 0), (0, -1), (0, 1)]:
            next_row, next_col = current_row + dr, current_col + dc

            if 0 <= next_row < rows and 0 <= next_col < cols and maze[next_row][next_col] == 0:
                tentative_g_cost = g_cost[(current_row, current_col)] + 1

                if tentative_g_cost < g_cost.get((next_row, next_col), float('infinity')):
                    came_from[(next_row, next_col)] = (current_row, current_col)
                    g_cost[(next_row, next_col)] = tentative_g_cost
                    f_cost = tentative_g_cost + heuristic((next_row, next_col), goal)
                    heapq.heappush(priority_queue, (f_cost, (next_row, next_col)))

    return came_from, g_cost, best_cost

maze = [
    [0, 1, 0, 0, 0],
    [0, 1, 0, 1, 0],
    [0, 0, 0, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0]
]

start = (0, 0)
goal = (4, 4)

came_from, g_cost, best_cost = a_star_maze(maze, start, goal)

# Follow the came_from links back from the goal; the start is appended last.
current = goal
path = [current]
while current in came_from:
    current = came_from[current]
    path.append(current)
path.reverse()

print("Shortest path:", path)
print("Shortest path cost:", best_cost)

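Basic Concepts of Search Algorithms

State Space

The state space is the set of all possible states together with the transition rules between them. Each state describes one configuration of the problem, and the transition rules define how to get from one state to another.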
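
To make the idea concrete, here is a sketch using the classic two-jug puzzle, which is not part of the original article: a state is the pair of water levels, and the transition rules are fill, empty, and pour. The jug capacities and the names `CAPACITY`, `successors`, and `reachable_states` are assumptions chosen for illustration.

```python
from collections import deque

CAPACITY = (4, 3)  # illustrative jug sizes

def successors(state):
    """Return every state reachable from `state` in one move."""
    a, b = state
    cap_a, cap_b = CAPACITY
    pour_ab = min(a, cap_b - b)  # amount that fits when pouring jug A into jug B
    pour_ba = min(b, cap_a - a)  # amount that fits when pouring jug B into jug A
    moves = {
        (cap_a, b), (a, cap_b),      # fill one jug
        (0, b), (a, 0),              # empty one jug
        (a - pour_ab, b + pour_ab),  # pour A into B
        (a + pour_ba, b - pour_ba),  # pour B into A
    }
    moves.discard(state)  # ignore moves that change nothing
    return moves

def reachable_states(start):
    """Enumerate the part of the state space reachable from `start`."""
    seen = {start}
    queue = deque([start])
    while queue:
        for nxt in successors(queue.popleft()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen

states = reachable_states((0, 0))
print(len(states))       # 14
print((2, 0) in states)  # True
```

The graph is never stored explicitly here: the `successors` function is the transition rule, and the search discovers states as it goes.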

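Boundary Conditions

Boundary conditions are the conditions checked during a search that decide whether it stops or continues. For example, the search may reach a state from which it cannot proceed, or it may already have found an optimal solution. In either case the boundary condition ends that branch of the search, or the search as a whole.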
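
One way to see these conditions in code is a depth-limited DFS, sketched below with three explicit checks: success, a depth cutoff, and a guard against revisiting nodes on the current path. The function `depth_limited_search`, its `limit` parameter, and `example_graph` are hypothetical names used only for this illustration.

```python
def depth_limited_search(graph, node, goal, limit, path=None):
    """Return a path to goal that uses at most `limit` edges, or None."""
    path = (path or []) + [node]
    if node == goal:      # success: stop searching
        return path
    if limit == 0:        # cutoff: do not go any deeper
        return None
    for neighbor in graph[node]:
        if neighbor in path:  # already on this path: skip it
            continue
        found = depth_limited_search(graph, neighbor, goal, limit - 1, path)
        if found:
            return found
    return None           # every branch from here failed

example_graph = {
    'A': ['B', 'C'],
    'B': ['D', 'E'],
    'C': ['F'],
    'D': [],
    'E': ['F'],
    'F': [],
}

print(depth_limited_search(example_graph, 'A', 'F', limit=1))  # None
print(depth_limited_search(example_graph, 'A', 'F', limit=2))  # ['A', 'C', 'F']
```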

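Search Strategy

The search strategy defines how the next node to visit is chosen. Common strategies include:

Breadth-first search (BFS): starting from the root, traverses the graph level by level.
Depth-first search (DFS): starting from the root, follows each branch as deep as possible before backtracking.
Dijkstra's algorithm: starting from the source, visits the unvisited node with the smallest known distance from the source.
A* search: combines Dijkstra's algorithm with a heuristic, visiting the node with the smallest estimated total cost (cost so far plus estimated cost to the goal).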
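
The difference between the first two strategies comes down to which end of the frontier the next node is taken from. The sketch below makes that explicit with a single traversal function; `traverse` and its `strategy` flag are hypothetical names introduced for this example.

```python
from collections import deque

def traverse(graph, start, strategy):
    """Visit reachable vertices; `strategy` is 'bfs' or 'dfs'."""
    frontier = deque([start])
    visited = set()
    order = []
    while frontier:
        # The strategy is only the rule for picking the next node:
        # oldest entry first (queue) for BFS, newest first (stack) for DFS.
        vertex = frontier.popleft() if strategy == 'bfs' else frontier.pop()
        if vertex in visited:
            continue
        visited.add(vertex)
        order.append(vertex)
        frontier.extend(n for n in graph[vertex] if n not in visited)
    return order

example_graph = {
    'A': ['B', 'C'],
    'B': ['D', 'E'],
    'C': ['F'],
    'D': [],
    'E': ['F'],
    'F': [],
}

print(traverse(example_graph, 'A', 'bfs'))  # ['A', 'B', 'C', 'D', 'E', 'F']
print(traverse(example_graph, 'A', 'dfs'))  # ['A', 'C', 'F', 'B', 'E', 'D']
```

Replacing the deque with a priority queue ordered by distance, or by distance plus heuristic, gives the Dijkstra and A* strategies from the same skeleton.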

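Optimizing Search Algorithms

Pruning

Pruning reduces the search space by discarding branches that cannot contain a solution, or cannot contain one good enough to be useful. For example, in A*, if an upper bound on the acceptable path cost is known, any node whose estimated total cost f = g + h exceeds that bound can be discarded: with a heuristic that never overestimates, no path through that node can come in under the bound. The version of a_star_maze below applies this through its max_cost parameter.

import heapq

def heuristic(node, goal):
    return abs(node[0] - goal[0]) + abs(node[1] - goal[1])

def a_star_maze(maze, start, goal, max_cost=float('infinity')):
    rows, cols = len(maze), len(maze[0])
    came_from = {}
    g_cost = {start: 0}
    priority_queue = [(0, start)]
    best_cost = float('infinity')  # stays infinite if no path within max_cost exists

    while priority_queue:
        current_cost, (current_row, current_col) = heapq.heappop(priority_queue)

        if (current_row, current_col) == goal:
            best_cost = current_cost  # h(goal) is 0, so f equals the path cost
            break

        for dr, dc in [(-1, 0), (1, 0), (0, -1), (0, 1)]:
            next_row, next_col = current_row + dr, current_col + dc

            if 0 <= next_row < rows and 0 <= next_col < cols and maze[next_row][next_col] == 0:
                tentative_g_cost = g_cost[(current_row, current_col)] + 1
                f_cost = tentative_g_cost + heuristic((next_row, next_col), goal)

                # Prune: even the optimistic estimate exceeds the allowed cost.
                if f_cost > max_cost:
                    continue

                if tentative_g_cost < g_cost.get((next_row, next_col), float('infinity')):
                    came_from[(next_row, next_col)] = (current_row, current_col)
                    g_cost[(next_row, next_col)] = tentative_g_cost
                    heapq.heappush(priority_queue, (f_cost, (next_row, next_col)))

    return came_from, g_cost, best_cost

maze = [
    [0, 1, 0, 0, 0],
    [0, 1, 0, 1, 0],
    [0, 0, 0, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0]
]

start = (0, 0)
goal = (4, 4)

# Discard any node that cannot lie on a path of cost 10 or less.
came_from, g_cost, best_cost = a_star_maze(maze, start, goal, max_cost=10)

if best_cost == float('infinity'):
    print("No path within the cost bound")
else:
    # Follow the came_from links back from the goal; the start is appended last.
    current = goal
    path = [current]
    while current in came_from:
        current = came_from[current]
        path.append(current)
    path.reverse()

    print("Shortest path:", path)
    print("Shortest path cost:", best_cost)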

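Using a Priority Queue

A priority queue is a queue whose elements are removed in priority order rather than arrival order. In search algorithms it is commonly used to choose the next node to visit: Dijkstra's algorithm orders the queue by distance from the start, and A* orders it by f = g + h. The a_star_search function in the A* section uses Python's heapq module as its priority queue.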
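
As a small standalone illustration of that behavior, the sketch below pushes three (priority, node) pairs in arbitrary order and pops them back in priority order using the standard-library `heapq` module. The node labels and priorities are placeholders.

```python
import heapq

frontier = []
# Each entry is (priority, node); insertion order does not matter.
heapq.heappush(frontier, (7, 'C'))
heapq.heappush(frontier, (2, 'A'))
heapq.heappush(frontier, (5, 'B'))

visit_order = []
while frontier:
    priority, node = heapq.heappop(frontier)  # always the smallest priority
    visit_order.append(node)

print(visit_order)  # ['A', 'B', 'C']
```

Because entries are compared as tuples, two entries with equal priority fall back to comparing the nodes themselves, so the nodes need to be comparable or a tie-breaking counter has to be added as a second field.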

State Reuse

State reuse avoids repeated work by recording states that have already been processed. In Dijkstra's algorithm, for example, a node's shortest distance is final once the node has been removed from the priority queue, so the node can be recorded in a visited set and skipped if it turns up again.

import heapq

def dijkstra_maze(maze, start):
    rows, cols = len(maze), len(maze[0])
    distances = {(row, col): float('infinity') for row in range(rows) for col in range(cols)}
    distances[start] = 0
    priority_queue = [(0, start)]
    visited = set()  # cells whose shortest distance is final

    while priority_queue:
        current_distance, (current_row, current_col) = heapq.heappop(priority_queue)

        # Skip cells that were already finalized through an earlier queue entry.
        if (current_row, current_col) in visited:
            continue
        visited.add((current_row, current_col))

        for dr, dc in [(-1, 0), (1, 0), (0, -1), (0, 1)]:
            next_row, next_col = current_row + dr, current_col + dc

            if 0 <= next_row < rows and 0 <= next_col < cols and maze[next_row][next_col] == 0 and (next_row, next_col) not in visited:
                next_distance = current_distance + 1

                if next_distance < distances[(next_row, next_col)]:
                    distances[(next_row, next_col)] = next_distance
                    heapq.heappush(priority_queue, (next_distance, (next_row, next_col)))

    return distances

maze = [
    [0, 1, 0, 0, 0],
    [0, 1, 0, 1, 0],
    [0, 0, 0, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0]
]

start = (0, 0)
distances = dijkstra_maze(maze, start)
print("Shortest-path cost from the start to every cell:", distances)

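Exercises and Summary

Exercises

Implement breadth-first search to find the shortest path from a start node to a target node.
Implement depth-first search to find all paths from a start node to a target node.
Implement Dijkstra's algorithm to find the shortest path from a start node to every node in a graph.
Implement A* search to find the shortest path from a start node to a target node.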

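Discussion of Application Scenarios

Search algorithms are widely used in many fields, for example:

Path planning: self-driving cars, drone navigation, and similar systems.
Game AI: searching for the best move in board games.
Graph analysis: relationship analysis in social networks, web page ranking, and so on.
Data mining: finding specific patterns or connections in large datasets.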

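Recommended Learning Resources

Recommended site: 慕课网 (imooc)
慕课网 offers many search-algorithm tutorials and hands-on projects for learners at different levels. Classic textbooks and other online resources can deepen your understanding further.

I hope this article helps you learn search algorithms.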
