首页手记【验证码识别专栏】今天不炼丹，用 cv 来秒验证码

【验证码识别专栏】今天不炼丹，用 cv 来秒验证码

标签：

Python

声明

本文章中所有内容仅供学习交流使用，不用于其他任何目的，不提供完整代码，抓包内容、敏感网址、数据接口等均已做脱敏处理，严禁用于商业用途和非法用途，否则由此产生的一切后果均与作者无关！

本文章未经许可禁止转载，禁止任何修改后二次传播，擅自使用本文讲解的技术而导致的任何意外，作者均不负责，若有侵权，请在公众号【K哥爬虫】联系作者立即删除！

前言

最近查看 QQ 群消息，无意间看到了粉丝们关于 opencv 的相关讨论，有热心的群友给出了大致的解决方向。同时也有很多星球伙伴，在星球分享关于验证码识别的相关知识，学习交流的氛围很好。本文就针对提问和已经存在的主题做一期总结与答疑，也是丰富验证码识别类型的相关文章：

分析目标

地址 1：aHR0cHM6Ly96bmZtLmJhaXdhbmcuY29tLw==
地址 2：aHR0cHM6Ly93d3cuZGluZ3hpYW5nLWluYy5jb20vYnVzaW5lc3MvY2FwdGNoYQ==
暂定某东旋转验证码

初识乾坤

首先我们来分析一下粉丝提到的站点 1，该站的查询接口返回的验证码图片如下：

上图分别是一张缺口图与一张完整的图像，这种类型在滑块中还是比较好处理的，完全可以利用 absdiff 来完成。在图像处理和计算机视觉领域，计算图像之间的差异是一项基本的任务。这种差异可以帮助我们识别图像中的变化、运动对象或者进行图像配准等。在 OpenCV 库中，absdiff 函数提供了一种高效的方式来计算两个图像之间的绝对差值：

特性

描述

函数名称

cv2.absdiff()

参数

src1 和 src2：两张要进行比较的图像，必须具有相同的尺寸和通道数。

返回值

返回一张新的图像，其中每个像素值是 src1 和 src2 在对应位置的绝对差异。

图像类型

支持灰度图像和彩色图像（RGB）。对于彩色图像，分别计算每个颜色通道的绝对差异。

常见用途

背景减除：检测视频中的运动物体；2. 图像对比：比较两张图像是否相似；3. 图像变化检测：找出图像间的变化。

应用场景

运动检测：通过计算背景和当前帧之间的差异检测前景物体；2. 监控：背景与前景变化检测；3. 图像注册与对比。

性能特点

快速计算两张图像之间的像素差异，适用于图像对比、背景减除等任务。

所以我们先将两张图像转为灰度，然后计算绝对差值，结果如下：

之后，使用 Canny 边缘检测提取边缘，最终使用轮廓查找，找到符合条件的轮廓即可，完整代码如下：

import cv2
import numpy as np

def detect_slider_gap_with_canny(original_path, with_hole_path):
""“
使用 Canny 边缘检测法检测滑块验证码缺口位置，并返回滑块需要移动的 x 坐标值。

参数:
original_path (str): 原图路径（无缺口）。
with_hole_path (str): 带缺口的图片路径。

返回:
int: 滑块需要移动的 x 坐标值。
”""

读取图片（灰度模式）

original = cv2.imread(original_path, cv2.IMREAD_GRAYSCALE)
with_hole = cv2.imread(with_hole_path, cv2.IMREAD_GRAYSCALE)

检查两张图片是否大小一致

if original.shape != with_hole.shape:
raise ValueError(“两张图片的尺寸不一致，请检查输入图片！”)

计算绝对差值

diff = cv2.absdiff(original, with_hole)
cv2.imshow(“diff”,diff)
cv2.waitKey(0)

使用 Canny 边缘检测提取边缘

edges = cv2.Canny(diff, threshold1=50, threshold2=150)

cv2.imshow(“edges”, edges)
cv2.waitKey(0)

查找边缘图中的轮廓

contours, _ = cv2.findContours(edges, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)

遍历轮廓，找到缺口的 x 坐标

for contour in contours:

获取轮廓的边界矩形

x, y, w, h = cv2.boundingRect(contour)

假设缺口的宽度和高度有一定限制，筛选可能的缺口

if 20 < w < 100 and 20 < h < 100: # 根据具体验证码调整范围
return x

如果未检测到缺口

raise ValueError(“未能检测到缺口，请检查输入图片！”)

示例用法

if name == “main”:
original_path = “1.png” # 原图路径
with_hole_path = “2.png” # 带缺口的图片路径

try:
slider_x = detect_slider_gap_with_canny(original_path, with_hole_path)
print(f"滑块需要移动的 x 值: {slider_x}")
except Exception as e:
print(f"错误: {e}")

渐入佳境

给粉丝答疑完之后，我们再来看看星球成员最近分享的，稍微复杂点的案例，是关于差异点击类型的验证码。该验证码是基于给定的图像中，选择其中不同类型的图案或文字：

该类型的验证码如下图所示：

主要是利用 PCA 特征降维和余弦计算，关键代码如下：

def find_anomalous_image(images):

提取所有图像的特征

features = [extract_features(img) for img in images]

设置 PCA 的 n_components 为样本数和特征数的最小值

n_components = min(len(features), len(features[0]))
pca = PCA(n_components=n_components)
reduced_features = pca.fit_transform(features)

计算余弦相似度

similarity_matrix = cosine_similarity(reduced_features)
average_similarity = np.mean(similarity_matrix, axis=1)

找到平均相似度最低的图像索引

anomalous_index = np.argmin(average_similarity)
return anomalous_index

那么，借此思路我们同样也可以用其他办法来解决，既然图案有差异，那么我们可以通过模板匹配的结果来计算得分，同样还是遍历每个图案，求自身与其他图案的模板的得分平均值，最终还是利用 np.argmin 去得到平均相似度最低的字符索引，即可得到答案：

import cv2
import ddddocr
import numpy as np

初始化 ddddocr 检测器

det = ddddocr.DdddOcr(det=True, show_ad=False)

def cv_show(img):
cv2.imshow(“img”, img)
cv2.waitKey(0)
cv2.destroyAllWindows()

def preprocess_image(image):
""“
对图像进行预处理：灰度化和二值化。
”"“
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
_, binary = cv2.threshold(gray, 127, 255, cv2.THRESH_BINARY)
return binary

def extract_cropped_regions(image_path):
”"“
从验证码图像中裁剪检测到的字符区域。

:param image_path: 验证码图像路径
:return: 裁剪后的字符图像列表和检测框坐标
”"“
original_image = cv2.imread(image_path)
with open(image_path, ‘rb’) as f:
image_data = f.read()
poses = det.detection(image_data) # 检测字符位置

cropped_images = []
for bbox in poses:
x_min, y_min, x_max, y_max = bbox
cropped = original_image[y_min:y_max, x_min:x_max]
cropped_images.append((cropped, bbox))

return cropped_images

def match_cropped_regions(cropped_images, threshold=0.8):
”"“
使用模板匹配比较裁剪的字符图像，找到异常字符。

:param cropped_images: 裁剪的字符图像和其对应的检测框
:param threshold: 模板匹配的相似度阈值
:return: 异常字符的索引
”""
num_images = len(cropped_images)
similarity_scores = np.zeros((num_images, num_images))

for i, (img1, _) in enumerate(cropped_images):
for j, (img2, _) in enumerate(cropped_images):
if i != j:

预处理图像

img1_processed = preprocess_image(img1)
img2_processed = preprocess_image(img2)

计算模板匹配得分

result = cv2.matchTemplate(img1_processed, img2_processed, cv2.TM_CCOEFF_NORMED)

print(result)

similarity_scores[i, j] = np.max(result)

计算每个字符与其他字符的平均相似度

print(similarity_scores)
average_similarity = np.mean(similarity_scores, axis=1)
print(average_similarity)

找到平均相似度最低的字符索引

anomalous_index = np.argmin(average_similarity)
return anomalous_index

def main():

输入验证码图像路径

captcha_image_path = “1.png” # 替换为你的验证码路径

裁剪字符区域

cropped_images = extract_cropped_regions(captcha_image_path)

cv_show(cropped_images)

找到异常字符

anomalous_index = match_cropped_regions(cropped_images)

可视化结果

original_image = cv2.imread(captcha_image_path)
for i, (_, bbox) in enumerate(cropped_images):
x_min, y_min, x_max, y_max = bbox
color = (0, 255, 0) if i != anomalous_index else (0, 0, 255) # 异常字符用红框标记
cv2.rectangle(original_image, (x_min, y_min), (x_max, y_max), color, 2)

保存并显示结果

output_path = “output.png"
cv2.imwrite(output_path, original_image)
print(f"结果已保存到: {output_path}”)

显示结果

cv2.imshow(“Matched Result”, original_image)
cv2.waitKey(0)
cv2.destroyAllWindows()

if name == “main”:
main()

依此类推，利用孪生 SiameseResNet50 网络，同样也可以不炼丹就解决该类型验证码， ResNet50 已经在大量的图像数据集（如 ImageNet）上训练过，预训练的 ResNet50 能提供非常好的性能，尤其是在图像分类、特征提取和其他计算机视觉任务中：

import cv2
import ddddocr

import torch
import torch.nn as nn
import torchvision.transforms as transforms
import torchvision.models as models

import numpy as np

使用 ddddocr 进行文字检测

det = ddddocr.DdddOcr(det=True, show_ad=False)

定义孪生网络模型

class SiameseResNet50(nn.Module):
def init(self):
super(SiameseResNet50, self).init()

加载预训练的ResNet50模型

resnet50 = models.resnet50(pretrained=True)

去掉最后的全连接层

self.resnet50 = nn.Sequential(*list(resnet50.children())[:-1])
self.fc = nn.Sequential(
nn.Linear(resnet50.fc.in_features, 512),
nn.ReLU(),
nn.Linear(512, 256),
)

def forward_one(self, x):

提取图像的特征

x = self.resnet50(x)
x = x.view(x.size(0), -1) # Flatten
x = self.fc(x)
return x

def forward(self, x1, x2):
output1 = self.forward_one(x1)
output2 = self.forward_one(x2)
return output1, output2

提取图像特征

def extract_features(image, model, transform):
image = cv2.resize(image, (224, 224)) # 调整为ResNet50的输入大小
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB) # 转换为RGB格式
image = transform(image).unsqueeze(0) # 应用变换并添加batch维度
model.eval()
with torch.no_grad():
feature = model.forward_one(image).numpy()
return feature

找到异常图像

def find_anomalous_image(images, model, transform):

提取所有图像的特征

features = [extract_features(img, model, transform) for img in images]
features = np.vstack(features)

计算特征之间的欧几里得距离

distance_matrix = np.linalg.norm(features[:, np.newaxis] - features[np.newaxis, :], axis=2)
average_distance = np.mean(distance_matrix, axis=1)
print(average_distance)

找到具有最大平均距离的图像

anomalous_index = np.argmax(average_distance)
return anomalous_index

def main():

加载预训练模型

model = SiameseResNet50()

定义图像变换

transform = transforms.Compose([
transforms.ToPILImage(),
transforms.ToTensor(),
transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)), # Normalize RGB图像
])

读取输入图像

image_path = “1.png” # 替换为您的图像路径
original_image = cv2.imread(image_path)
with open(image_path, ‘rb’) as f:
image_data = f.read()
poses = det.detection(image_data)

裁剪检测到的区域

cropped_images = []
for bbox in poses:
x_min, y_min, x_max, y_max = bbox
cropped = original_image[y_min:y_max, x_min:x_max]
cropped_images.append(cropped)

找到最异常的图像

anomalous_index = find_anomalous_image(cropped_images, model, transform)

输出异常图像的坐标

print(f"与其他不同的图像坐标为: {poses[anomalous_index]}")

绘制矩形框，跳过最异常的图像

for i, bbox in enumerate(poses):
x_min, y_min, x_max, y_max = bbox
color = (0, 255, 0) if i == anomalous_index else (0, 0, 255) # 绿色表示异常，红色表示其他
cv2.rectangle(original_image, (x_min, y_min), (x_max, y_max), color, 2)

保存结果图像

output_path = “output.png"
cv2.imwrite(output_path, original_image)
print(f"结果已保存到: {output_path}”)

if name == “main”:
main()

崭露头角

经过上文几轮介绍，我们已经拿下 2 种类型验证码的识别，这俩种应该算比较简单的，那么对于个别类型的验证码如果单单通过降维来提取主干特征，筛选正确答案，是远远不能满足要求的。比如上文的差异点击升级以后，便是字体风格类型验证码的识别，想从主干特征几乎一模一样的字体中筛选出正确答案，我们对特征的提取是需要叠加的，还要考虑文字的字体、笔画以及大小等等因素的影响，该类型的验证码如下图所示：

可以看到，此类验证码对于特征的提取，肯定不是单纯的模板匹配或者直接相似度就能解决的。换汤不换药，我们首先还是将图像转为灰度图，然后创建一个特征容器 features，用来存储特征集合。

提取轮廓面积与周长

_, binary_image = cv2.threshold(inverted_image, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
contours, _ = cv2.findContours(binary_image, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)

轮廓面积与周长

if contours:
largest_contour = max(contours, key=cv2.contourArea)
contour_area = cv2.contourArea(largest_contour)
contour_perimeter = cv2.arcLength(largest_contour, True)
features.extend([contour_area, contour_perimeter])

轮廓宽高比

_, _, width, height = cv2.boundingRect(largest_contour)
aspect_ratio = width / height
features.append(aspect_ratio)
else:
features.extend([0, 0, 0]) # 如果没有找到轮廓，添加 0

笔画宽度提取

这类验证码最重要的特征就是笔画了，主要提取宽度均值和宽度标准差：

kernel = np.ones((3, 3), np.uint8)
dilated_image = cv2.dilate(binary_image, kernel, iterations=1)
eroded_image = cv2.erode(binary_image, kernel, iterations=1)
stroke_width_map = dilated_image - eroded_image
stroke_width_mean = np.mean(stroke_width_map)
stroke_width_std = np.std(stroke_width_map)
features.extend([stroke_width_mean, stroke_width_std])

骨架提取

skeleton_image = skeletonize(binary_image > 0) # 先二值化，再进行骨架化
skeleton_length = np.sum(skeleton_image)
features.append(skeleton_length)

灰度统计与局部梯度

梯度统计主要用来求边缘强度，不同风格的字体边缘特征有时候有明显差别：

5. 灰度统计特征

mean_intensity = np.mean(gray_image) # 灰度均值
std_intensity = np.std(gray_image) # 灰度标准差
features.extend([mean_intensity, std_intensity])

6. 局部梯度分析（提取边缘强度）

gradient_x = cv2.Sobel(inverted_image, cv2.CV_64F, 1, 0, ksize=3)
gradient_y = cv2.Sobel(inverted_image, cv2.CV_64F, 0, 1, ksize=3)
gradient_magnitude = np.sqrt(gradient_x ** 2 + gradient_y ** 2)
gradient_mean = np.mean(gradient_magnitude)
gradient_std = np.std(gradient_magnitude)
features.extend([gradient_mean, gradient_std])

局部特征

局部特征主要用来提取字体形状和结构：

7. 局部特征（将图像分割为网格）

grid_size = 8 # 网格大小
cell_height, cell_width = resized_image.shape[0] // grid_size, resized_image.shape[1] // grid_size
local_grid_features = []

遍历每个网格，计算每个小区域的均值

for i in range(grid_size):
for j in range(grid_size):
cell = resized_image[i * cell_height:(i + 1) * cell_height, j * cell_width:(j + 1) * cell_width]
local_grid_features.append(np.mean(cell)) # 计算网格单元的均值

features.extend(local_grid_features)

最终特征合并，计算相似性矩阵，回归上文相似度计算，继续计算均值，找到平均相似度最低的索引：

similarity_matrix = cosine_similarity(features)

计算每个图像与其他图像的平均相似度

average_similarity = np.mean(similarity_matrix, axis=1)
print(“平均相似度:”, average_similarity)

找到平均相似度最低的图像索引

anomalous_index = np.argmin(average_similarity)

最终效果如下：

行云流水

走到这里，已经对 3 种验证码进行了处理，最后我们来用 cv 处理一下旋转验证码，这种方法对于中小型网站，是足够使用的，相反对于 AI 类型的验证码也是一种处理办法，对于 AI 生成的旋转验证码，模型通常没有很好的泛性进行适配，如果模型可以一劳永逸，那么风控频繁的更新将毫无意义：

感知哈希 (pHash) + 汉明距离

以某度验证码为例，图库有限的情况下，这不疑也是一种解决办法，处理思路就是通过将图片的主干特征提取出来，计算平均值，生成二进制哈希值，代码如下：

def phash(image):
resized = cv2.resize(image, (32, 32), interpolation=cv2.INTER_AREA)
gray = cv2.cvtColor(resized, cv2.COLOR_BGR2GRAY)
dct = cv2.dct(np.float32(gray))
dct_low = dct[:8, :8] # 提取左上角低频部分
mean_val = np.mean(dct_low)
hash_str = ‘’.join([‘1’ if x > mean_val else ‘0’ for x in dct_low.flatten()])
return hash_str

通过读取转正的图像，将哈希值以列表的形式读出，最终存储在序列化文件中，格式如下：

[‘1010100001000000101000000010000010000000100000001000000000000000’, ‘1011101000110010111000001010010010000001000000001000000000100010’, ‘1010101100000000111010001000000010000000000000000000000000000000’, ‘1110100000001100101000001000000010000000100000000000000000010000’, ‘1010001010100010101000000000001000000010000010001000001000001000’, ‘1110001100110000101110001100000010000000000000101000000000000000’]

那么哈希值的对比就需要用到汉明距离了。

汉明距离的基本原理

汉明距离是用于衡量两个等长字符串之间的相似程度的指标，它表示两个字符串对应位上不同字符的个数。用于计算两个哈希值的相似性。距离越小，图像越相似，反之则差异越大：

from scipy.spatial.distance import hamming

hash1 = "1100101011110000"
hash2 = “1100101011010000"
distance = hamming(list(hash1), list(hash2)) # 汉明距离
print(f"汉明距离: {distance}”)

最终通过将待旋转的图片经过 360 度旋转计算哈希值，最后找到最相似的角度，即可完成旋转验证码的识别：

import cv2
import numpy as np

def rotate_image(image, angle):
h, w = image.shape[:2]
center = (w // 2, h // 2)
matrix = cv2.getRotationMatrix2D(center, angle, 1.0)
return cv2.warpAffine(image, matrix, (w, h))

def find_best_angle(target_image, reference_image, step=1):
target_hash = phash(target_image)
best_angle, min_distance = 0, float(‘inf’)
for angle in range(0, 360, step):
rotated = rotate_image(reference_image, angle)
rotated_hash = phash(rotated)
distance = hamming(list(target_hash), list(rotated_hash))
if distance < min_distance:
best_angle, min_distance = angle, distance
return best_angle

个别角度可能顺时针、逆时针不同，需要用 360 - 计算出来的角度。

更详细的代码可以参考热心网友已经整理好的 GitHub，方法大同小异：

相关链接：https://github.com/decodecaptcha/Rotate-Captcha-Angle-Prediction

SIFT 特征匹配 + 仿射变换

SIFT 的基本原理

SIFT 是一种经典的特征提取算法，能提取图像的关键点及其局部特征，具有旋转不变性和尺度不变性，适用于旋转验证码中图像特征的匹配。

在标注阶段，首先需要准备一些已知的正确图像（即“转正”图像），这些图像的旋转角度是已知的。然后，通过计算这些图像的直方图特征，并将其保存到 pkl 文件 中，以便在后续的预测阶段使用：

import cv2
import pickle
import numpy as np

def calculate_histogram(image):
# 计算图像的灰度直方图
grayscale_image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
hist = cv2.calcHist([grayscale_image], [0], None, [256], [0, 256])
# 归一化直方图
cv2.normalize(hist, hist, alpha=0, beta=1, norm_type=cv2.NORM_MINMAX)
return hist

def save_histograms(images, output_file):
histograms = {‘baidu’: {}, ‘allimg’: {}}

for idx, img_path in enumerate(images):
    img = cv2.imread(img_path)  # 读取图像
    hist = calculate_histogram(img)  # 计算图像的灰度直方图
    histograms['baidu'][f"image_{idx}"] = hist  # 将直方图保存到 'baidu' 键下
    histograms['allimg'][f"image_{idx}"] = img  # 将图像数据存储到 'allimg' 键下

# 将数据保存到 pkl 文件
with open(output_file, 'wb') as file:
    pickle.dump(histograms, file)

print(f"saved to {output_file}")

示例：保存直方图数据和图像集合

image_paths = [‘5_288.jpeg’, ‘6_268.jpeg’] # 示例图片路径
output_file = ‘baidu.pkl’ # 输出文件路径
save_histograms(image_paths, output_file)

之后读取未转正的图像，计算直方图，找到已经储存到 pkl 文件中最合适的图像，可以用 cv2.compareHist() 方法来计算图像之间的直方图相似度，找到相似的图像以后利用 SIFT 进行特征匹配和仿射变换来计算旋转角度。

流程如下：

import cv2
import time
import pickle
import numpy as np

从文件加载直方图模型

def load_histograms_from_file(file_path):
with open(file_path, ‘rb’) as file:
histograms_data = pickle.load(file)
return histograms_data

特征匹配计算图像的旋转角度

def compute_rotation_angle_by_features(reference_img, query_img):
# 将图像转换为灰度图
query_gray = cv2.cvtColor(query_img, cv2.COLOR_BGR2GRAY)
reference_gray = cv2.cvtColor(reference_img, cv2.COLOR_BGR2GRAY)

# 使用 ORB 特征提取器代替 SIFT（ORB 更加高效）
orb = cv2.ORB_create()

# 提取特征点和描述符
kp1, des1 = orb.detectAndCompute(query_gray, None)
kp2, des2 = orb.detectAndCompute(reference_gray, None)

# 使用暴力匹配器进行特征点匹配
bf = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
matches = bf.match(des1, des2)
matches = sorted(matches, key=lambda x: x.distance)

# 提取匹配点
src_pts = np.float32([kp1[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
dst_pts = np.float32([kp2[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)

# 计算变换矩阵，这里使用 RANSAC 方法剔除错误匹配
matrix, _ = cv2.estimateAffinePartial2D(src_pts, dst_pts, method=cv2.RANSAC, ransacReprojThreshold=5.0)

if matrix is not None:
    # 提取旋转角度
    angle = np.arctan2(matrix[0, 1], matrix[0, 0]) * (180 / np.pi)
else:
    angle = None
    print("未能估计变换矩阵。")

return angle

根据旋转角度旋转图像

def rotate_image(img, angle):
height, width = img.shape[:2]
# 计算旋转矩阵
rotation_matrix = cv2.getRotationMatrix2D((width / 2, height / 2), angle, 1)

# 进行旋转变换
rotated_img = cv2.warpAffine(img, rotation_matrix, (width, height), flags=cv2.INTER_CUBIC)
return rotated_img

使用直方图匹配估计图像的旋转角度

def estimate_image_angle(histograms, image_collection, query_img):
start_time = time.time()

# 计算查询图像的灰度直方图
query_gray = cv2.cvtColor(query_img, cv2.COLOR_BGR2GRAY)
query_hist = cv2.calcHist([query_gray], [0], None, [256], [0, 256])
cv2.normalize(query_hist, query_hist, 0, 1, cv2.NORM_MINMAX)

best_match_score = -1
best_match_key = None

# 查找与查询图像最相似的图像
for key, hist in histograms.items():
    similarity = cv2.compareHist(query_hist, hist, cv2.HISTCMP_CORREL)
    if similarity > best_match_score:
        best_match_score = similarity
        best_match_key = key

print(f"找到最匹配的图像: {best_match_key}")

# 使用 'best_match_key' 作为索引，从 'image_collection' 字典中获取图像
best_match_image = image_collection[best_match_key]

# 计算最佳匹配图像与查询图像之间的旋转角度
rotation_angle = compute_rotation_angle_by_features(best_match_image, query_img)
end_time = time.time()

print(f"执行时间: {end_time - start_time:.2f} 秒，旋转角度: {rotation_angle:.2f} 度。")

# 显示最匹配的图像
cv2.imshow("Best Match Image", best_match_image)
cv2.waitKey(0)
cv2.destroyAllWindows()

return rotation_angle

加载直方图模型和图像集合

histograms_data = load_histograms_from_file(‘baidu.pkl’)
histogram_model = histograms_data[‘baidu’]
image_dataset = histograms_data[‘allimg’]
query_image_path = ‘5.jpeg’ # 示例查询图像路径

query_img = cv2.imread(query_image_path)

估计查询图像的旋转角度

estimated_angle = estimate_image_angle(histogram_model, image_dataset, query_img)

最终结果如下：

更细致的处理方法还可以创建掩码，将中心图抠出来，这样会更加准确，可以根据这个思路去设计标注工具，基本很快就能完成对抗，持续对抗是一个不错的选择。这部分标注好的数据集大概 5000 个，后续我会上传到知识星球中，仅供学习交流。

点击查看更多内容

为 TA 点赞

若觉得本文不错，就分享一下吧！

评论

评论

共同学习，写下你的评论

评论加载中...

展开查看更多评论

作者其他优质文章

正在加载中

K哥爬虫

Python工程师

手记
篇

粉丝

28

获赞与收藏

43

关注作者，订阅最新文章

阅读免费教程

Python 办公自动化教程

17个小节 25769 875

Python 算法入门教程

15个小节 27466 1075

Python 进阶应用教程

38个小节 65882 1036

推荐

评论

收藏

共同学习，写下你的评论



感谢您的支持，我会继续努力的～

扫码打赏，你说多少就多少

赞赏金额会直接到老师账户

支付方式

打开微信扫一扫，即可进行扫码打赏哦

今天注册有机会得

100积分直接送

付费专栏免费学

大额优惠券免费领

立即参与放弃机会

点击
抽奖

慕课手记新用户专享福利

恭喜你，你的运气太好了，居然抽中了 100个积分！

恭喜你，抽中了价值元的专栏！

太棒了，直接落到你账户里！

积分商城里的罗技鼠标、机械键盘、
Kindle 阅读器、小米平衡车
Apple iPad （10.2英寸）、大额优惠券
在等着你去兑换了噢

作者：

免费赠送

兑换码：1111222211 复制

优惠券可用于购买实战课、体系课
无门槛使用

先去看看，有什么好东西马上兑换我爱学习，选课去


热搜

最近搜索清空

【验证码识别专栏】今天不炼丹，用 cv 来秒验证码

声明

前言

分析目标

初识乾坤

读取图片（灰度模式）

检查两张图片是否大小一致

计算绝对差值

使用 Canny 边缘检测提取边缘

查找边缘图中的轮廓

遍历轮廓，找到缺口的 x 坐标

获取轮廓的边界矩形

假设缺口的宽度和高度有一定限制，筛选可能的缺口

如果未检测到缺口

示例用法

渐入佳境

提取所有图像的特征

设置 PCA 的 n_components 为样本数和特征数的最小值

计算余弦相似度

找到平均相似度最低的图像索引

初始化 ddddocr 检测器

预处理图像

计算模板匹配得分

print(result)

计算每个字符与其他字符的平均相似度

找到平均相似度最低的字符索引

输入验证码图像路径

裁剪字符区域

cv_show(cropped_images)

找到异常字符

可视化结果

保存并显示结果

显示结果

使用 ddddocr 进行文字检测

定义孪生网络模型

加载预训练的ResNet50模型

去掉最后的全连接层

提取图像的特征

提取图像特征

找到异常图像

提取所有图像的特征

计算特征之间的欧几里得距离

找到具有最大平均距离的图像

加载预训练模型

定义图像变换

读取输入图像

裁剪检测到的区域

找到最异常的图像

输出异常图像的坐标

绘制矩形框，跳过最异常的图像

保存结果图像

崭露头角

提取轮廓面积与周长

轮廓面积与周长

轮廓宽高比

笔画宽度提取

骨架提取

灰度统计与局部梯度

5. 灰度统计特征

6. 局部梯度分析（提取边缘强度）

局部特征

7. 局部特征（将图像分割为网格）

遍历每个网格，计算每个小区域的均值

计算每个图像与其他图像的平均相似度

找到平均相似度最低的图像索引

行云流水

感知哈希 (pHash) + 汉明距离

汉明距离的基本原理

SIFT 特征匹配 + 仿射变换

SIFT 的基本原理

示例：保存直方图数据和图像集合

从文件加载直方图模型

特征匹配计算图像的旋转角度

根据旋转角度旋转图像

使用直方图匹配估计图像的旋转角度

加载直方图模型和图像集合

估计查询图像的旋转角度

阅读免费教程