
Basic CAPTCHA Recognition Methods and Source Code

2023-10-13 Source: 易榕旅网

Background

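A friend of mine was recently working on a project that was already in good shape and wanted to polish it a little further, so he suggested knocking out this kind of CAPTCHA, and so I did. The result recognizes a single image in under 200 ms, and the accuracy, counted by hand over 500 samples, is 95%. I had no prior experience in this area and was feeling my way across the river. In the spirit of sharing experience, here is the whole line of analysis. Apologies to the experts for this modest effort.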

Here are some of the recognition results.

Do they look familiar?

Step 1: Removing background noise and binarizing

I considered several approaches for this step.

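Method 1: compute the color distribution of the image and treat colors with a low share as background noise. Because the background noise and the foreground color are not clearly distinguishable, I tried many ways of picking out the foreground and none of them removed the noise well, so I eventually gave up on this method.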
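To make the abandoned idea concrete, here is a minimal sketch of one way a color-share filter could look. It is not the author's code: the class name, the packed 0xRRGGBB pixel layout and the `minShare` parameter are all assumptions made for illustration.

```csharp
using System.Collections.Generic;

// Hypothetical illustration of "colors with a low share are noise".
public static class ColorHistogram
{
    // pixels[x, y] holds a packed 0xRRGGBB color (assumed layout).
    // Returns a mask that is true where the pixel's color covers
    // less than minShare of the whole image.
    public static bool[,] MarkRareColors(int[,] pixels, double minShare)
    {
        int width = pixels.GetLength(0);
        int height = pixels.GetLength(1);

        // Count how many pixels use each exact color.
        var counts = new Dictionary<int, int>();
        foreach (int color in pixels)
        {
            int n;
            counts.TryGetValue(color, out n);
            counts[color] = n + 1;
        }

        double totalPixels = width * height;
        var noise = new bool[width, height];
        for (int x = 0; x < width; x++)
            for (int y = 0; y < height; y++)
                noise[x, y] = counts[pixels[x, y]] / totalPixels < minShare;
        return noise;
    }
}
```

A filter like this only works when noise colors are clearly rarer than glyph colors, which is the assumption that failed for these images.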

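Method 2: afterwards I looked around online a little, and the currently popular approach is to compute a grayscale value and then binarize it against a threshold. A grayscale image is simply a weighted combination of the channels, with weights chosen to match the human eye's sensitivity to each color, and those weights mean nothing to a computer. A moment's thought shows that the two steps can be merged completely. So I removed the background noise and binarized in a single step: a pixel counts as foreground when the sum of its R, G and B components is below 500. The result was very satisfying.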
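As a rough illustration of merging the two steps, the sketch below puts a conventional weighted-grayscale threshold next to the plain channel-sum threshold. Only the sum threshold of 500 comes from the article. The class and method names, the `rgb[x, y, channel]` layout and the grayscale cutoff parameter are assumptions, and the 0.299/0.587/0.114 weights are the common BT.601 luma coefficients.

```csharp
// Hypothetical helper for illustration; not part of the original source.
public static class Binarizer
{
    // Conventional route: weighted grayscale first, then a threshold.
    // The cutoff is an assumed parameter, not a value from the article.
    public static bool IsForegroundByLuma(int r, int g, int b, double cutoff)
    {
        double gray = 0.299 * r + 0.587 * g + 0.114 * b;
        return gray < cutoff;
    }

    // Merged route: skip the grayscale image and threshold the plain
    // sum of the three channels (500 in the article).
    public static bool IsForegroundBySum(int r, int g, int b, int threshold = 500)
    {
        return r + g + b < threshold;
    }

    // Binarize a whole image stored as rgb[x, y, channel] (assumed layout).
    public static bool[,] ToTable(byte[,,] rgb, int threshold = 500)
    {
        int width = rgb.GetLength(0);
        int height = rgb.GetLength(1);
        var table = new bool[width, height];
        for (int x = 0; x < width; x++)
            for (int y = 0; y < height; y++)
                table[x, y] = IsForegroundBySum(
                    rgb[x, y, 0], rgb[x, y, 1], rgb[x, y, 2], threshold);
        return table;
    }
}
```

With the sum rule, a dark pixel such as (200, 200, 99) is kept as foreground while (200, 200, 100) is dropped, so light background speckle disappears in the same pass that produces the binary table.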

Step 2: Building character samples

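Samples matter a great deal to a computer, because a computer can hardly reason logically, and even if it could, it would need long training before it satisfied you. So recognition works by comparing against samples prepared in advance. If you look closely at these CAPTCHAs you will notice a flaw: almost all of them use the same font. So I built a set of samples for that font by hand. Since the previous step already produces noise-free output, it can be used directly. Building the samples is simple and tedious, but it needs care: one careless mistake can lower the recognition rate for a particular symbol. Across the 500 samples only 31 distinct characters appeared. Fortunately, someone at a certain department had thought about easily confused characters such as 1 and I, or 0 and O. Otherwise that department would be taking even more blame.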
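The article does not show how the samples were stored, but the loading loop in the published `Cracker` constructor implies a layout: a character, a width byte, a height byte, then width × height booleans in column order, with a `'\0'` character closing the list, all gzip-compressed. The sketch below writes that layout. `SamplePacker` and `Pack` are hypothetical names, and this is an inferred counterpart to the reader, not the author's tool.

```csharp
using System.Collections.Generic;
using System.IO;
using System.IO.Compression;

// Hypothetical writer for the sample layout implied by the reader loop.
public static class SamplePacker
{
    // Each sample pairs a character with its binarized glyph, map[x, y].
    // Width and height must each fit in one byte (at most 255).
    public static byte[] Pack(IEnumerable<KeyValuePair<char, bool[,]>> samples)
    {
        using (var stream = new MemoryStream())
        {
            using (var gzip = new GZipStream(stream, CompressionMode.Compress))
            using (var writer = new BinaryWriter(gzip))
            {
                foreach (var sample in samples)
                {
                    bool[,] map = sample.Value;
                    int width = map.GetLength(0);
                    int height = map.GetLength(1);

                    writer.Write(sample.Key);
                    writer.Write((byte)width);
                    writer.Write((byte)height);
                    for (int i = 0; i < width; i++)
                        for (int j = 0; j < height; j++)
                            writer.Write(map[i, j]);
                }
                writer.Write('\0'); // terminator the reader stops on
            }
            // ToArray still works after the streams above are closed.
            return stream.ToArray();
        }
    }
}
```

The returned bytes could then be pasted into source as a `byte[]` literal, which is presumably how the long array in the listing was produced.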

Step 3: Matching

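A single match uses the simplest, most primitive binary comparison, but what is compared is a match rate rather than a match count. I defined a set of scoring rules. The guiding principle is: "a pixel that should be set and is set earns a point; one that should be set but is missing loses a point; one that should not be set but is set loses a moderate amount; anything outside the reachable area is not scored."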
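A small worked example may make the scoring rule easier to follow. The sketch below restates the rule as a standalone function. The 0.55 penalty for an unexpected pixel is the value used in the published listing; the names `MatchScore` and `Score` and the choice to return 0 when no template pixel is in range are assumptions made here.

```csharp
// Hypothetical standalone restatement of the scoring rule.
public static class MatchScore
{
    // source: binarized image, template: binarized glyph sample,
    // (x0, y0): where the template's top-left corner is placed.
    public static double Score(bool[,] source, bool[,] template,
                               int x0, int y0, double extraPenalty = 0.55)
    {
        double total = 0;   // template pixels that were actually compared
        double score = 0;
        for (int i = 0; i < template.GetLength(0); i++)
        {
            int x = i + x0;
            if (x < 0 || x >= source.GetLength(0)) continue;   // unreachable: not scored
            for (int j = 0; j < template.GetLength(1); j++)
            {
                int y = j + y0;
                if (y < 0 || y >= source.GetLength(1)) continue;

                if (template[i, j])
                {
                    total++;
                    score += source[x, y] ? 1 : -1;   // present: +1, missing: -1
                }
                else if (source[x, y])
                {
                    score -= extraPenalty;            // unexpected pixel
                }
            }
        }
        // Assumption: an empty overlap scores 0 rather than dividing by zero.
        return total > 0 ? score / total : 0;
    }
}
```

For a template with three set pixels, an identical source scores 3/3 = 1.0, a source with one extra pixel scores (3 − 0.55)/3 ≈ 0.817, and a source missing one expected pixel scores (2 − 1)/3 ≈ 0.333. Dividing by the number of compared template pixels is what makes this a rate, so large and small glyphs compete on equal terms.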

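Because part of one symbol can match about as well as the whole of another symbol, each single match has to pick its best result over an enlarged area. Within a certain range, find the best match; that best match is the symbol at the current position.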
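The window search can be sketched independently of the scoring function. In the published listing the window runs from 2 pixels left to 5 pixels right of the current position, and from 3 pixels above the top edge to a few pixels below the lowest vertical placement. The names `WindowSearch`, `WindowHit` and `FindBest` below are hypothetical, and the score callback stands in for the single-position match.

```csharp
using System;

// Hypothetical result holder for the best placement in a window.
public sealed class WindowHit
{
    public int X;
    public int Y;
    public double Rate;
}

public static class WindowSearch
{
    // Tries every placement with xMin <= x < xMax and yMin <= y < yMax
    // and keeps the one with the highest rate. If no placement scores
    // above 0, the returned hit keeps Rate == 0.
    public static WindowHit FindBest(Func<int, int, double> scoreAt,
                                     int xMin, int xMax, int yMin, int yMax)
    {
        var best = new WindowHit { X = xMin, Y = yMin, Rate = 0 };
        for (int x = xMin; x < xMax; x++)
            for (int y = yMin; y < yMax; y++)
            {
                double rate = scoreAt(x, y);
                if (rate > best.Rate)
                {
                    best.X = x;
                    best.Y = y;
                    best.Rate = rate;
                }
            }
        return best;
    }
}
```

Running this once per sample and keeping the sample with the highest rate gives the symbol for the current position.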

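Once a best match has been found, the matching position moves a large step to the right; if no suitable best match is found, it moves a small step to the right.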
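The big-step/small-step scan can be sketched with the matcher abstracted away. The acceptance rate of 0.6, the large step of 10 pixels past the matched position and the 7-pixel right margin are the values in the published listing. `Scanner`, `Candidate` and the `bestMatchAt` callback are hypothetical names, and the `Math.Max` guard is an addition here to guarantee forward progress.

```csharp
using System;
using System.Text;

// Hypothetical result of matching at one scan position.
public sealed class Candidate
{
    public char Char;
    public int X;        // x position where the best match was found
    public double Rate;  // match rate of that best match
}

public static class Scanner
{
    public static string Scan(int start, int width, Func<int, Candidate> bestMatchAt,
                              double minRate = 0.6, int bigStep = 10, int margin = 7)
    {
        var result = new StringBuilder();
        int next = start;
        while (next < width - margin)
        {
            Candidate best = bestMatchAt(next);
            if (best != null && best.Rate > minRate)
            {
                // Accepted: record the symbol and jump past it.
                result.Append(best.Char);
                next = Math.Max(next + 1, best.X + bigStep);
            }
            else
            {
                // Nothing convincing here: creep one pixel to the right.
                next += 1;
            }
        }
        return result.ToString();
    }
}
```

The large step saves most of the work, since every scan position otherwise costs a full window search against all samples.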

Step 4: Optimization and tuning

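Every algorithm needs optimization and tuning. The job now is to find the best parameter settings and the best way to organize the code. This step often takes the most time and effort.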

Step 5: Verifying the results

This step is purely manual: check the results by hand and work out the accuracy.

Reflections

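The results are in, there is not much code, and it works well. People in this line of work often want something general-purpose. Whether a solution can be general depends largely on its level of abstraction. This method is plain template matching, so it is naturally not general, but the approach and the thinking behind it are. Each case needs its own analysis. Distorted text, hollow text and the like are much more complicated to handle. There are also methods online that use third-party image libraries, and those may be more general. When I have the time and the interest I will come back to this topic.

Source code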

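I wrestled for a while over whether to publish this source code. Similar commercial services already exist online, the recognition itself is not very difficult, and given the inherent flaw in a certain system, this CAPTCHA might as well not be there at all. So I am publishing the code, for study and exchange only.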

    using System.Collections.Generic;
    using System.Drawing;
    using System.IO;
    using System.IO.Compression;

    namespace Crack12306Captcha
    {
        public class Cracker
        {
            // Glyph samples, loaded once from the embedded gzip blob.
            List<CharInfo> words_ = new List<CharInfo>();

            public Cracker()
            {
                var bytes = new byte[] {
                    0x1f, 0x8b, 0x08, 0x00, 0x00, 0x00, 0x00, 0x00, 0x04, 0x00, 0xc5, 0x58, 0xd9, 0x92, 0x13, 0x31,
                    0x0c, 0x94, 0x9e, 0x93, 0x0c, 0x61, 0x97, 0x2f, 0xe1, 0x58, 0xe0, 0x91, 0x9b, 0x82, 0x62, 0x0b,
                    0x58, 0xee, 0xff, 0xff, 0x10, 0xd8, 0xcc, 0xc8, 0xea, 0x96, 0x6c, 0x8f, 0x13, 0x48, 0xe1, 0xaa,
                    0x4d, 0x46, 0x96, 0x6d, 0xb5, 0x8e, 0x96, 0x67, 0x73, 0x7f, 0x3b, 0x09, 0x0e, 0x25, 0x41, 0x49,
                    0xa3, 0xae, 0xd7, 0x5b, 0xa9, 0xa8, 0xd5, 0xb4, 0x76, 0x02, 0x6a, 0x5c, 0x52, 0x94, 0x54, 0xed,
                    0x18, 0x5a, 0x7f, 0x18, 0x00, 0x00, 0x84, 0x07, 0x1b, 0x80, 0x4a, 0x9a, 0x08, 0x35, 0xb8, 0x81,
                    0x50, 0xe7, 0xad, 0xbe, 0xc4, 0x8e, 0xb1, 0x4f, 0x2d, 0x5f, 0xba, 0x80, 0xbb, 0xfd, 0x9a, 0xad,
                    0x19, 0x36, 0xe5, 0xad, 0x87, 0xf1, 0x10, 0xc0, 0x8d, 0xc6, 0x50, 0x40, 0x52, 0xf8, 0xb3, 0x98,
                    0x2c, 0xd6, 0xec, 0x59, 0xe7, 0x0d, 0x3e, 0x0f, 0x93, 0x3e, 0x1d, 0x02, 0x7a, 0x18, 0x8f, 0xb6,
                    0xc7, 0x46, 0x4e, 0x01, 0xa3, 0x96, 0xdc, 0x3a, 0x20, 0x77, 0xbf, 0x2c, 0x24, 0xe4, 0x80, 0xa9,
                    0x20, 0x14, 0xe5, 0x2d, 0xb5, 0x68, 0xc9, 0x55, 0x89, 0x23, 0x96, 0x82, 0xaa, 0xba, 0x58, 0xa6,
                    0x03, 0x38, 0x71, 0x4b, 0x29, 0xd2, 0x47, 0x80, 0xe3, 0x84, 0x91, 0xf4, 0x78, 0x43, 0x64, 0x41,
                    0x7b, 0x73, 0x99, 0x80, 0x42, 0x48, 0x00, 0xde, 0x00, 0x12, 0x88, 0x80, 0xdb, 0x51, 0x4a, 0x49,
                    0x84, 0x43, 0xf6, 0x51, 0x90, 0x27, 0x21, 0xc9, 0xf8, 0xac, 0x00, 0x4d, 0xcd, 0x46, 0x09, 0x9d,
                    0x15, 0x78, 0xe0, 0x00, 0x1e, 0x44, 0x2a, 0x51, 0x8c, 0xbc, 0xd3, 0xa3, 0x68, 0x8a, 0xd5, 0x3a,
                    0x20, 0x79, 0xba, 0x4d, 0x71, 0x4c, 0x0b, 0x91, 0x98, 0x90, 0x7b, 0x2a, 0x42, 0xc5, 0x78, 0x7a,
                    0xfc, 0xd5, 0x1b, 0x4b, 0x09, 0xa7, 0x27, 0x99, 0x38, 0x05, 0x01, 0xc2, 0x80, 0x39, 0x9c, 0x67,
                    0xbb, 0x4e, 0x7f, 0x6c, 0x33, 0xdd, 0xed, 0x87, 0x55, 0xda, 0x5d, 0xb5, 0x56, 0x33, 0xc6, 0xf9,
                    0xea, 0x60, 0x64, 0xcf, 0xa7, 0x41, 0xe0, 0x5c, 0x1c, 0xc4, 0xb2, 0x25, 0xa3, 0x89, 0x88, 0x8d,
                    0x16, 0x00, 0xb5, 0xed, 0xa5, 0x22, 0x9d, 0x52, 0x41, 0x53, 0x8d, 0x92, 0x7f, 0x31, 0x51, 0x3f,
                    0xa8, 0x00, 0x85, 0x8a, 0x71, 0x10, 0x92, 0x78, 0xc4, 0x59, 0x08, 0x39, 0x69, 0xa9, 0x38, 0x41,
                    0x48, 0xf7, 0x40, 0x5a, 0x03, 0xd5, 0x3a, 0xf5, 0xe5, 0x9d, 0x33, 0x66, 0xc3, 0xd7, 0x1f, 0xef,
                    0x94, 0xa0, 0x53, 0xea, 0xf4, 0x15, 0xb2, 0x1c, 0x40, 0x2d, 0xcf, 0xaf, 0xce, 0xe9, 0xd4, 0x7a,
                    0x89, 0x09, 0xe6, 0xdd, 0xdb, 0x0e, 0xb8, 0x58, 0xa7, 0x60, 0x37, 0xfd, 0xf2, 0xfa, 0x2c, 0x4e,
                    0x51, 0x87, 0x0d, 0xfc, 0x16, 0x72, 0x2a, 0x5f, 0xc0, 0x80, 0xf0, 0x54, 0xa7, 0xde, 0xfc, 0x15,
                    0x8b, 0x9a, 0x36, 0x3a, 0x2c, 0x62, 0xfc, 0xd4, 0x8c, 0x31, 0xb7, 0xea, 0xd7, 0x26, 0xc4, 0xaf,
                    0x75, 0xea, 0xdb, 0x8b, 0xff, 0x9b, 0x9b, 0x50, 0x7e, 0xfe, 0x15, 0xab, 0x17, 0x2f, 0x96, 0x96,
                    0xbd, 0xaa, 0x87, 0xdd, 0x77, 0xa3, 0x77, 0xd3, 0x85, 0xf0, 0xe0, 0x58, 0xd5, 0xf6, 0x8c, 0xcd,
                    0xc4, 0x63, 0x52, 0x12, 0x48, 0x46, 0x0f, 0x93, 0x5a, 0xe3, 0xea, 0x24, 0x67, 0x73, 0x63, 0xa0,
                    0xdf, 0xdf, 0x3d, 0x67, 0xf6, 0xa9, 0xfc, 0xed, 0x08, 0xe3, 0x82, 0x57, 0x08, 0x35, 0x47, 0x68,
                    0x9c, 0x01, 0x40, 0x87, 0x8b, 0xbd, 0x0c, 0xb3, 0xf4, 0xe1, 0x72, 0xd7, 0x54, 0x62, 0xfd, 0x40,
                    0xed, 0x99, 0xa6, 0x7e, 0x2b, 0xe4, 0xb4, 0xc4, 0x62, 0x0d, 0x79, 0xae, 0x1b, 0xd7, 0xf4, 0x09,
                    0xb7, 0xe1, 0x7c, 0x44, 0x09, 0x9a, 0xda, 0xff, 0x52, 0x6a, 0x3c, 0xe1, 0xc8, 0xd7, 0xbd, 0xbb,
                    0xbe, 0x37, 0xfc, 0xd6, 0xd5, 0x4e, 0x3c, 0x40, 0x2a, 0x4b, 0x39, 0x1a, 0xbd, 0x2a, 0xcd, 0xc1,
                    0x18, 0x59, 0x40, 0x62, 0x78, 0xec, 0x63, 0x19, 0x72, 0xf0, 0xcf, 0xf8, 0x38, 0xfa, 0x42, 0x3a,
                    0xc8, 0x02, 0xec, 0x5b, 0xeb, 0x8d, 0xae, 0xf1, 0x45, 0xdd, 0x32, 0x98, 0x35, 0x3c, 0x9f, 0xa6,
                    0x3d, 0xce, 0x13, 0xce, 0x94, 0x38, 0x87, 0x00, 0x8d, 0x85, 0xc4, 0x70, 0x17, 0x26, 0x0e, 0xa6,
                    0x1e, 0x16, 0xcb, 0xbf, 0x52, 0xdf, 0x29, 0x63, 0xc4, 0xf6, 0x8c, 0x35, 0xba, 0xf2, 0xf9, 0x1f,
                    0xbf, 0x73, 0x1f, 0x91, 0x1b, 0x9e, 0x24, 0x5e, 0x63, 0x22, 0x82, 0x23, 0x05, 0x19, 0xb9, 0x71,
                    0x73, 0xdc, 0xcf, 0x05, 0x88, 0x94, 0x71, 0xdb, 0xdd, 0x48, 0x10, 0xd5, 0x55, 0xb3, 0x52, 0xc3,
                    0x1b, 0x01, 0x94, 0x13, 0x74, 0x94, 0x3a, 0x80, 0x2f, 0x39, 0xe2, 0x75, 0x0e, 0xf2, 0xc6, 0x18,
                    0xdc, 0x46, 0xfc, 0xf3, 0xea, 0x14, 0x80, 0xc1, 0xce, 0x24, 0xee, 0x72, 0xed, 0x94, 0xaf, 0xfb,
                    0xa9, 0xaa, 0x4a, 0xe0, 0xd4, 0x22, 0xc6, 0xf0, 0x57, 0x1d, 0x8e, 0xd2, 0x90, 0xc6, 0x0c, 0xd3,
                    0x9a, 0x53, 0xfb, 0xd6, 0xb7, 0xdd, 0x14, 0xd4, 0xbd, 0x41, 0xa7, 0x80, 0x7b, 0x23, 0xfe, 0x34,
                    0x56, 0x0d, 0x96, 0x46, 0x02, 0xfe, 0xfd, 0xb2, 0x00, 0x5f, 0x01, 0x9c, 0xa0, 0x32, 0x39, 0xd7,
                    0x90, 0xc2, 0x6c, 0xc7, 0x4e, 0x68, 0x88, 0x7d, 0x9f, 0x9b, 0xcf, 0xa7, 0xbe, 0xa0, 0xfc, 0x18,
                    0x7d, 0x07, 0x5b, 0xa9, 0xbe, 0x56, 0x1f, 0x67, 0x1a, 0x4a, 0x91, 0x9c, 0x04, 0x38, 0x53, 0x6b,
                    0x70, 0x68, 0x8f, 0xea, 0xf4, 0x34, 0x87, 0x7f, 0x6e, 0x82, 0xc3, 0xc1, 0xab, 0x40, 0xc4, 0x50,
                    0x13, 0x0e, 0x33, 0x5d, 0x67, 0x7d, 0x01, 0x1f, 0xdb, 0xc0, 0x7f, 0xed, 0x87, 0x7f, 0xbc, 0x0f,
                    0x75, 0xe0, 0xa5, 0xba, 0xc0, 0x84, 0x3d, 0x24, 0x04, 0xe0, 0xf1, 0x16, 0x41, 0x3b, 0x74, 0xd2,
                    0x52, 0xc5, 0xf8, 0x7c, 0x12, 0xfb, 0xe4, 0x37, 0x5b, 0xfb, 0x57, 0x11, 0xa1, 0x18, 0x00, 0x00,
                };

                // Each record: char, width byte, height byte, width*height booleans.
                // A '\0' char marks the end of the list.
                using (var stream = new MemoryStream(bytes))
                using (var gzip = new GZipStream(stream, CompressionMode.Decompress))
                using (var reader = new BinaryReader(gzip))
                {
                    while (true)
                    {
                        char ch = reader.ReadChar();
                        if (ch == '\0')
                            break;
                        int width = reader.ReadByte();
                        int height = reader.ReadByte();

                        bool[,] map = new bool[width, height];
                        for (int i = 0; i < width; i++)
                            for (int j = 0; j < height; j++)
                                map[i, j] = reader.ReadBoolean();
                        words_.Add(new CharInfo(ch, map));
                    }
                }
            }

            // Recognizes the text in a CAPTCHA image, scanning left to right.
            public string Read(Bitmap bmp)
            {
                var result = string.Empty;
                var width = bmp.Width;
                var table = ToTable(bmp);
                var next = SearchNext(table, -1);

                while (next < width - 7)
                {
                    var matched = Match(table, next);
                    if (matched.Rate > 0.6)
                    {
                        // Accepted: take the symbol and advance a large step.
                        result += matched.Char;
                        next = matched.X + 10;
                    }
                    else
                    {
                        // No convincing match: advance a small step.
                        next += 1;
                    }
                }

                return result;
            }

            // Removes background noise and binarizes in one pass:
            // a pixel is foreground when R + G + B < 500.
            private bool[,] ToTable(Bitmap bmp)
            {
                var table = new bool[bmp.Width, bmp.Height];
                for (int i = 0; i < bmp.Width; i++)
                    for (int j = 0; j < bmp.Height; j++)
                    {
                        var color = bmp.GetPixel(i, j);
                        table[i, j] = (color.R + color.G + color.B < 500);
                    }
                return table;
            }

            // Returns the first column after 'start' that contains a foreground pixel.
            private int SearchNext(bool[,] table, int start)
            {
                var width = table.GetLength(0);
                var height = table.GetLength(1);
                for (start++; start < width; start++)
                    for (int j = 0; j < height; j++)
                        if (table[start, j])
                            return start;

                return start;
            }

            // Match rate of one sample placed with its top-left corner at (x0, y0).
            private double FixedMatch(bool[,] source, bool[,] target, int x0, int y0)
            {
                double total = 0;
                double count = 0;
                int targetWidth = target.GetLength(0);
                int targetHeight = target.GetLength(1);
                int sourceWidth = source.GetLength(0);
                int sourceHeight = source.GetLength(1);
                int x, y;

                for (int i = 0; i < targetWidth; i++)
                {
                    x = i + x0;
                    if (x < 0 || x >= sourceWidth)
                        continue;
                    for (int j = 0; j < targetHeight; j++)
                    {
                        y = j + y0;
                        if (y < 0 || y >= sourceHeight)
                            continue;

                        if (target[i, j])
                        {
                            total++;
                            if (source[x, y])
                                count++;        // expected and present
                            else
                                count--;        // expected but missing
                        }
                        else if (source[x, y])
                            count -= 0.55;      // present but not expected
                    }
                }

                return count / total;
            }

            // Best placement of one sample within a window around 'start'.
            private MatchedChar ScopeMatch(bool[,] source, bool[,] target, int start)
            {
                int targetHeight = target.GetLength(1);
                int sourceHeight = source.GetLength(1);

                double max = 0;
                var matched = new MatchedChar();
                for (int i = -2; i < 6; i++)
                    for (int j = -3; j < sourceHeight - targetHeight + 5; j++)
                    {
                        double rate = FixedMatch(source, target, i + start, j);
                        if (rate > max)
                        {
                            max = rate;
                            matched.X = i + start;
                            matched.Y = j;
                            matched.Rate = rate;
                        }
                    }

                return matched;
            }

            // Best-scoring sample at the current scan position.
            private MatchedChar Match(bool[,] source, int start)
            {
                MatchedChar best = null;
                foreach (var info in words_)
                {
                    var matched = ScopeMatch(source, info.Table, start);
                    matched.Char = info.Char;
                    if (best == null || best.Rate < matched.Rate)
                        best = matched;
                }
                return best;
            }

            private class CharInfo
            {
                public char Char { get; private set; }
                public bool[,] Table { get; private set; }

                public CharInfo(char ch, bool[,] table)
                {
                    Char = ch;
                    Table = table;
                }
            }

            private class MatchedChar
            {
                public int X { get; set; }
                public int Y { get; set; }
                public char Char { get; set; }
                public double Rate { get; set; }
            }
        }
    }

Usage

    var cracker = new Cracker();
    var result = cracker.Read(img);
