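What you want to achieve is accessing a capturing group. I prefer named capturing groups, and a very simple helper function takes care of reading them: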
package main

import (
	"fmt"
	"regexp"
)

// Our example input
const input = "X-sync-status: done\r\n"

// We anchor the regex to the beginning of the input with "^".
// Then we have a fixed string until our capturing group begins.
// Within our capturing group, we want all consecutive word characters
// (letters, digits and underscore) that follow.
const regexString = `(?i)^X-sync-status: (?P<status>\w*)`

// MustCompile panics at startup if the pattern is invalid,
// so we know the regexp can be used.
var syncStatusRegexp = regexp.MustCompile(regexString)

// The helper function...
func namedResults(re *regexp.Regexp, in string) map[string]string {
	result := make(map[string]string)

	// ... does the matching
	match := re.FindStringSubmatch(in)
	if match == nil {
		// No match: return an empty map instead of indexing a nil slice.
		return result
	}

	// and puts the value for each named capturing group
	// into the result map
	for i, name := range re.SubexpNames() {
		if i != 0 && name != "" {
			result[name] = match[i]
		}
	}
	return result
}

func main() {
	fmt.Println(namedResults(syncStatusRegexp, input)["status"])
}
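If you only need one group, a smaller variant is possible. The following is a minimal sketch, assuming Go 1.15 or newer (where `Regexp.SubexpIndex` was added); `groupValue` and `statusRe` are names I made up for illustration and are not part of the code above.

```go
package main

import (
	"fmt"
	"regexp"
)

// Same pattern as in the answer, compiled once at package level.
var statusRe = regexp.MustCompile(`(?i)^X-sync-status: (?P<status>\w*)`)

// groupValue (hypothetical helper) returns the text captured by the named
// group. The boolean is false if the input does not match or the pattern
// has no group with that name.
func groupValue(re *regexp.Regexp, in, name string) (string, bool) {
	match := re.FindStringSubmatch(in)
	idx := re.SubexpIndex(name) // -1 if there is no group with this name
	if match == nil || idx < 0 {
		return "", false
	}
	return match[idx], true
}

func main() {
	status, ok := groupValue(statusRe, "X-sync-status: done\r\n", "status")
	fmt.Println(status, ok) // done true
}
```

Returning a boolean next to the value lets the caller tell "header not present" apart from "header present but empty", which a plain map lookup does not.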
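Note that your current regex is slightly faulty, because it captures the space as well. With your current regex, the result would be " done" instead of "done".
Edit: Of course, you can do this more cheaply without a regex: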
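To illustrate the difference, here is a small sketch. The first pattern is not your regex, just a hypothetical one I wrote that makes the same mistake of putting the whitespace inside the group; `firstGroup` is likewise a made-up helper.

```go
package main

import (
	"fmt"
	"regexp"
)

// Hypothetical pattern: the whitespace after the colon is inside the group.
var withSpace = regexp.MustCompile(`^X-sync-status:(\s*\w+)`)

// The space is matched literally before the group starts.
var withoutSpace = regexp.MustCompile(`^X-sync-status: (\w+)`)

// firstGroup returns the first capturing group, or "" if nothing matched.
func firstGroup(re *regexp.Regexp, in string) string {
	m := re.FindStringSubmatch(in)
	if m == nil {
		return ""
	}
	return m[1]
}

func main() {
	const line = "X-sync-status: done\r\n"
	fmt.Printf("%q\n", firstGroup(withSpace, line))    // " done"
	fmt.Printf("%q\n", firstGroup(withoutSpace, line)) // "done"
}
```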
fmt.Print(strings.Trim(strings.Split(input, ":")[1], " \r\n"))
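The one-liner above panics on a line without a colon and cuts off values that contain a colon themselves. A slightly more defensive sketch could look like the following; `headerValue` is a hypothetical name, and the `X-time` line is made-up sample input.

```go
package main

import (
	"fmt"
	"strings"
)

// headerValue (hypothetical helper) returns the trimmed text after the
// first colon. The boolean is false if the line contains no colon.
func headerValue(line string) (string, bool) {
	// SplitN with n=2 splits at the first colon only, so colons inside
	// the value are preserved.
	parts := strings.SplitN(line, ":", 2)
	if len(parts) != 2 {
		return "", false
	}
	return strings.TrimSpace(parts[1]), true
}

func main() {
	status, ok := headerValue("X-sync-status: done\r\n")
	fmt.Printf("%q %v\n", status, ok) // "done" true

	when, ok := headerValue("X-time: 12:30:00\r\n")
	fmt.Printf("%q %v\n", when, ok) // "12:30:00" true

	_, ok = headerValue("no colon here")
	fmt.Println(ok) // false
}
```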
Edit 2: I was curious how much cheaper the split approach actually is, so I put together a very rough comparison:
package main

import (
	"fmt"
	"log"
	"regexp"
	"strings"
)

// Our example input
const input = "X-sync-status: done\r\n"

// We anchor the regex to the beginning of the input with "^".
// Then we have a fixed string until our capturing group begins.
// Within our capturing group, we want all consecutive word characters
// (letters, digits and underscore) that follow.
const regexString = `(?i)^X-sync-status: (?P<status>\w*)`

// MustCompile panics at startup if the pattern is invalid,
// so we know the regexp can be used.
var syncStatusRegexp = regexp.MustCompile(regexString)

// statusBySplit takes everything after the first colon and trims
// spaces and line endings. It assumes the line contains a colon.
func statusBySplit(in string) string {
	return strings.Trim(strings.Split(in, ":")[1], " \r\n")
}

// statusByRegexp returns the first capturing group.
// It assumes the input matches the regexp.
func statusByRegexp(re *regexp.Regexp, in string) string {
	return re.FindStringSubmatch(in)[1]
}
[...]
And a small benchmark:
package main

import "testing"

func BenchmarkRegexp(b *testing.B) {
	for i := 0; i < b.N; i++ {
		statusByRegexp(syncStatusRegexp, input)
	}
}

func BenchmarkSplit(b *testing.B) {
	for i := 0; i < b.N; i++ {
		statusBySplit(input)
	}
}
I then ran each of them 5 times with 1, 2 and 4 CPUs available. IMHO, the results are quite convincing:
go test -run=^$ -test.bench=. -test.benchmem -test.cpu 1,2,4 -test.count=5
goos: darwin
goarch: amd64
pkg: github.com/mwmahlberg/so-regex
BenchmarkRegexp 5000000 383 ns/op 32 B/op 1 allocs/op
BenchmarkRegexp 5000000 382 ns/op 32 B/op 1 allocs/op
BenchmarkRegexp 5000000 382 ns/op 32 B/op 1 allocs/op
BenchmarkRegexp 5000000 382 ns/op 32 B/op 1 allocs/op
BenchmarkRegexp 5000000 384 ns/op 32 B/op 1 allocs/op
BenchmarkRegexp-2 5000000 384 ns/op 32 B/op 1 allocs/op
BenchmarkRegexp-2 5000000 382 ns/op 32 B/op 1 allocs/op
BenchmarkRegexp-2 5000000 384 ns/op 32 B/op 1 allocs/op
BenchmarkRegexp-2 5000000 382 ns/op 32 B/op 1 allocs/op
BenchmarkRegexp-2 5000000 382 ns/op 32 B/op 1 allocs/op
BenchmarkRegexp-4 5000000 382 ns/op 32 B/op 1 allocs/op
BenchmarkRegexp-4 5000000 382 ns/op 32 B/op 1 allocs/op
BenchmarkRegexp-4 5000000 380 ns/op 32 B/op 1 allocs/op
BenchmarkRegexp-4 5000000 380 ns/op 32 B/op 1 allocs/op
BenchmarkRegexp-4 5000000 377 ns/op 32 B/op 1 allocs/op
BenchmarkSplit 10000000 161 ns/op 80 B/op 3 allocs/op
BenchmarkSplit 10000000 161 ns/op 80 B/op 3 allocs/op
BenchmarkSplit 10000000 164 ns/op 80 B/op 3 allocs/op
BenchmarkSplit 10000000 165 ns/op 80 B/op 3 allocs/op
BenchmarkSplit 10000000 162 ns/op 80 B/op 3 allocs/op
BenchmarkSplit-2 10000000 159 ns/op 80 B/op 3 allocs/op
BenchmarkSplit-2 10000000 167 ns/op 80 B/op 3 allocs/op
BenchmarkSplit-2 10000000 161 ns/op 80 B/op 3 allocs/op
BenchmarkSplit-2 10000000 159 ns/op 80 B/op 3 allocs/op
BenchmarkSplit-2 10000000 159 ns/op 80 B/op 3 allocs/op
BenchmarkSplit-4 10000000 159 ns/op 80 B/op 3 allocs/op
BenchmarkSplit-4 10000000 161 ns/op 80 B/op 3 allocs/op
BenchmarkSplit-4 10000000 159 ns/op 80 B/op 3 allocs/op
BenchmarkSplit-4 10000000 160 ns/op 80 B/op 3 allocs/op
BenchmarkSplit-4 10000000 160 ns/op 80 B/op 3 allocs/op
PASS
ok github.com/mwmahlberg/so-regex 61.340s
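The numbers clearly show that, for pulling the value out of a header like this one, using split is more than twice as fast as a precompiled regexp (roughly 160 ns/op versus 380 ns/op), although it allocates more (80 B/op in 3 allocations versus 32 B/op in 1). For your use case, I would clearly go with split.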
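The Split runs above report 3 allocs/op. If those matter, a variant based on `strings.Cut` might be worth trying. This is only a sketch, assuming Go 1.18 or newer; `statusByCut` is a hypothetical name, and I have not benchmarked it, so treat any speed or allocation advantage as something to measure yourself.

```go
package main

import (
	"fmt"
	"strings"
)

// statusByCut is a hypothetical variant of statusBySplit. strings.Cut
// returns substrings of its input instead of building a slice of parts.
// The boolean is false if the line contains no colon.
func statusByCut(in string) (string, bool) {
	_, value, found := strings.Cut(in, ":")
	if !found {
		return "", false
	}
	return strings.TrimSpace(value), true
}

func main() {
	status, ok := statusByCut("X-sync-status: done\r\n")
	fmt.Printf("%q %v\n", status, ok) // "done" true
}
```

To compare it, you could add a third benchmark function that calls it in the same loop as the two existing ones.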