A Swift regular expression class that abstracts away some of the complexities of NSRegularExpression.
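Every time I have to use NSRegularExpression in Swift, I make the same mistakes with ranges and with converting between NSRange and Range<String.Index>.
Extracting content with capture groups is also tedious and somewhat error-prone. I wanted to abstract away the things I keep getting wrong.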
let inputText: String = <some text to match against>
// Build the regex to match against (in this case, <number>\t<string>)
// This regex has two capture groups, one for the number and one for the string.
let regex = try DSFRegex(#"(\d*)\t\"([^\"]+)\""#)
// Retrieve ALL the matches for the supplied text
let searchResult = regex.matches(for: inputText)
// Loop over each of the matches found, and print them out
searchResult.forEach { match in
    let foundStr = inputText[match.range]         // The text of the entire match
    let numberVal = inputText[match.captures[0]]  // Retrieve the first capture group text.
    let stringVal = inputText[match.captures[1]]  // Retrieve the second capture group text.
    Swift.print("Number is \(numberVal), String is \(stringVal)")
}
The basic structure of the 'matches' result is as follows:
Matches
  > matches: An array of regex matches
    > range: A match range. This range specifies the match range within the original text being searched
    > captures: An array of capture groups
      > A capture range. This range represents the range of a capture within the original text being searched
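All ranges returned to the caller (and, conversely, all ranges passed to the regex object) are ranges within the Swift String that was passed in for matching.
This is very important: NSRegularExpression works with NSString, and NSString and String report different code point and character range information. The difference is most visible with characters in the high Unicode range, such as the emoji 🇦🇲 👨👩👦.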
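To make the difference concrete, here is a small sketch that uses only Foundation (not DSFRegex itself). It compares the character count and the UTF-16 length of a string that starts with a flag emoji, then converts a range in both directions. The sample text is my own assumption, chosen only for illustration.

```swift
import Foundation

// A string that starts with a flag emoji (two regional-indicator scalars).
let text = "🇦🇲 flag"

// Swift counts user-perceived characters; NSString counts UTF-16 code units.
let characterCount = text.count       // 6
let utf16Count = text.utf16.count     // 9

// Locate "flag" as a Swift range, then express it as an NSRange.
let swiftRange = text.range(of: "flag")!
let characterOffset = text.distance(from: text.startIndex, to: swiftRange.lowerBound)  // 2
let nsRange = NSRange(swiftRange, in: text)   // location 5, length 4

// Converting back recovers the original Swift range.
let roundTrip = Range(nsRange, in: text)!
print(characterCount, utf16Count, characterOffset, nsRange.location, text[roundTrip])
```

The word "flag" starts at character offset 2 but at UTF-16 offset 5, which is exactly the kind of mismatch that causes off-by-some errors when the two range types are mixed.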
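You create a match object by calling the constructor with a regex pattern. If the pattern is malformed or cannot be compiled, the constructor throws an error.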
// Match against dummy phone numbers XXXX-YYY-ZZZ
let phoneNumberRegex = try DSFRegex(#"(\d{4})-(\d{3})-(\d{3})"#)
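As a rough illustration of handling that error, the sketch below uses Foundation's NSRegularExpression directly, since that is the class being wrapped. The helper name compilePattern is hypothetical and not part of DSFRegex; I am assuming the wrapper surfaces a comparable error for an invalid pattern.

```swift
import Foundation

/// Hypothetical helper: returns nil instead of throwing when the pattern is invalid.
func compilePattern(_ pattern: String) -> NSRegularExpression? {
    do {
        return try NSRegularExpression(pattern: pattern, options: [])
    } catch {
        // The thrown error describes why the pattern could not be compiled.
        print("Invalid pattern '\(pattern)': \(error.localizedDescription)")
        return nil
    }
}

let valid = compilePattern(#"(\d{4})-(\d{3})-(\d{3})"#)
let invalid = compilePattern(#"(\d{4}"#)   // unbalanced parenthesis
print(valid != nil, invalid != nil)
```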
To check whether a string matches the regex, use the hasMatch method.
let hasAMatch = phoneNumberRegex.hasMatch("0499-999-999") // true
let noMatch = phoneNumberRegex.hasMatch("0499 999 999") // false
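For comparison, this is roughly what the same check looks like when written against NSRegularExpression by hand. It is a sketch, not the library's implementation, and containsMatch is a hypothetical name.

```swift
import Foundation

/// Hypothetical stand-in for a hasMatch-style check, built on NSRegularExpression.
func containsMatch(_ regex: NSRegularExpression, in text: String) -> Bool {
    // NSRegularExpression works in UTF-16 units, so derive the full range from the String.
    let fullRange = NSRange(text.startIndex..<text.endIndex, in: text)
    return regex.firstMatch(in: text, options: [], range: fullRange) != nil
}

let phonePattern = try! NSRegularExpression(pattern: #"(\d{4})-(\d{3})-(\d{3})"#, options: [])
let dashed = containsMatch(phonePattern, in: "0499-999-999")
let spaced = containsMatch(phonePattern, in: "0499 999 999")
print(dashed, spaced)
```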
To extract all of the match information, use the matches method.
let result = phoneNumberRegex.matches(for: "0499-999-999 0491-111-444 4324-222-123")
result.forEach { match in
    // `match` is the match object itself, so pass it directly
    let matchText = result.text(for: match)
    Swift.print("Match `\(matchText)`")
    for capture in match.captures {
        let captureText = result.text(for: capture)
        Swift.print("  - `\(captureText)`")
    }
}
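The bookkeeping that this hides can be sketched with NSRegularExpression alone: every capture comes back as an NSRange and has to be converted to a Swift range before the text can be sliced. The helper captureTexts below is hypothetical and only illustrates the idea.

```swift
import Foundation

/// Hypothetical helper: returns, for every match, the text of each capture group.
func captureTexts(of regex: NSRegularExpression, in text: String) -> [[String]] {
    let fullRange = NSRange(text.startIndex..<text.endIndex, in: text)
    return regex.matches(in: text, options: [], range: fullRange).map { result -> [String] in
        // Range 0 is the whole match, so capture groups start at index 1.
        (1..<result.numberOfRanges).compactMap { index -> String? in
            // Convert the NSRange to a Swift range; groups that did not participate yield nil.
            Range(result.range(at: index), in: text).map { String(text[$0]) }
        }
    }
}

let phone = try! NSRegularExpression(pattern: #"(\d{4})-(\d{3})-(\d{3})"#, options: [])
let groups = captureTexts(of: phone, in: "0499-999-999 0491-111-444")
print(groups)
```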
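If you have a large input text, a complex regex that takes a while to process, or tight memory constraints, you can enumerate the matches instead of processing everything up front.
The enumeration method lets you stop processing at any point (for example, if you have a time limit, or you are looking for one specific match in the text).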
/// Find all email addresses within a text
let inputString = "… some input string …"
let emailRegex = try DSFRegex("… some regex …")
emailRegex.enumerateMatches(in: inputString) { (match) -> Bool in
    // Extract match information
    let matchRange = match.range
    let matchText = inputString[match.range]
    Swift.print("Found '\(matchText)' at range \(matchRange)")

    // Continue processing
    return true
}
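The underlying Foundation call signals an early stop through a pointer rather than a return value. The sketch below shows that mechanism with NSRegularExpression; the helper firstMatches and its limit parameter are hypothetical, and limit is assumed to be at least 1.

```swift
import Foundation

/// Hypothetical helper: collects match texts and stops the enumeration once `limit` is reached.
func firstMatches(of regex: NSRegularExpression, in text: String, limit: Int) -> [String] {
    var found: [String] = []
    let fullRange = NSRange(text.startIndex..<text.endIndex, in: text)
    regex.enumerateMatches(in: text, options: [], range: fullRange) { result, _, stop in
        guard let result = result, let range = Range(result.range, in: text) else { return }
        found.append(String(text[range]))
        if found.count >= limit {
            // Setting the flag tells NSRegularExpression to stop enumerating.
            stop.pointee = true
        }
    }
    return found
}

let numbers = try! NSRegularExpression(pattern: #"\d+"#, options: [])
let firstTwo = firstMatches(of: numbers, in: "1 22 333 4444", limit: 2)
print(firstTwo)
```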
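A string search cursor is useful when you search a string only occasionally, for example each time the user clicks a "Next" button. The cursor keeps track of the current match and is used to find the next match in the string.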
var searchCursor: DSFRegex.Cursor?
var content: String

@IBAction func startSearch(_ sender: Any) {
    // The constructor throws on an invalid pattern, so bail out if it cannot be built
    guard let regex = try? DSFRegex(... some pattern ...) else { return }

    // Find the first match in the string
    self.searchCursor = self.content.firstMatch(for: regex)
    self.displayForCurrentSearch()
}

@IBAction func nextSearchResult(_ sender: Any) {
    if let previous = self.searchCursor {
        // Find the next match in the string, starting from the previous match
        self.searchCursor = self.content.nextMatch(for: previous)
    }
    self.displayForCurrentSearch()
}

internal func displayForCurrentSearch() {
    // Update the UI reflecting the search result found in self.searchCursor
    ...
}
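One way to picture what a cursor does is a value that remembers the regex, the text, and the range of the last match, so the next search can resume from the end of that range. The sketch below builds this on NSRegularExpression; the SearchCursor type and both functions are hypothetical and do not reflect how DSFRegex.Cursor is actually implemented. It does not handle zero-length matches.

```swift
import Foundation

/// Hypothetical cursor: remembers the regex, the text, and the range of the last match.
struct SearchCursor {
    let regex: NSRegularExpression
    let text: String
    let range: Range<String.Index>
}

/// Finds the first match at or after `start` (defaults to the start of the text).
func firstMatch(of regex: NSRegularExpression, in text: String,
                from start: String.Index? = nil) -> SearchCursor? {
    let searchRange = NSRange((start ?? text.startIndex)..<text.endIndex, in: text)
    guard let result = regex.firstMatch(in: text, options: [], range: searchRange),
          let range = Range(result.range, in: text) else { return nil }
    return SearchCursor(regex: regex, text: text, range: range)
}

/// Resumes the search immediately after the match stored in the cursor.
func nextMatch(after cursor: SearchCursor) -> SearchCursor? {
    return firstMatch(of: cursor.regex, in: cursor.text, from: cursor.range.upperBound)
}

let digitRuns = try! NSRegularExpression(pattern: #"\d+"#, options: [])
let sample = "a1 b22 c333"
let first = firstMatch(of: digitRuns, in: sample)
let second = first.flatMap { nextMatch(after: $0) }
let third = second.flatMap { nextMatch(after: $0) }
let fourth = third.flatMap { nextMatch(after: $0) }
print(first.map { sample[$0.range] } ?? "none", fourth == nil)
```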
Returns a new string in which every match of the regex is replaced by the template string.
// Redact email addresses within the text
let emailRegex = try DSFRegex("… some regex …")
let redacted = emailRegex.stringByReplacingMatches(
    in: inputString,
    withTemplate: NSRegularExpression.escapedTemplate(for: "<REDACTED-EMAIL-ADDRESS>")
)
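The call to escapedTemplate(for:) matters whenever the replacement text might contain template metacharacters such as `$` or `\`. The following sketch uses NSRegularExpression directly to show the difference; the pattern and input are assumptions made up for the example.

```swift
import Foundation

let digits = try! NSRegularExpression(pattern: #"(\d+)"#, options: [])
let input = "id 42"
let fullRange = NSRange(input.startIndex..<input.endIndex, in: input)

// Unescaped template: "$1" is expanded to the text of the first capture group.
let expanded = digits.stringByReplacingMatches(
    in: input, options: [], range: fullRange,
    withTemplate: "[$1]")

// Escaped template: "$1" is inserted literally.
let literal = digits.stringByReplacingMatches(
    in: input, options: [], range: fullRange,
    withTemplate: NSRegularExpression.escapedTemplate(for: "[$1]"))

print(expanded)  // id [42]
print(literal)   // id [$1]
```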
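The primary class, used to perform regex matching.
A class containing all of the text results of a regex match. It also provides methods that help extract text from match and/or capture objects.
A match object. Stores the range of the match within the original text. If the regex defines capture groups, it also contains an array of capture group objects.
A capture represents a single range matching one capture within a regex result. Each match may contain zero or more captures, depending on the captures available in the regex.
An incremental cursor object, used for searching via the String extension.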
pod 'DSFRegex', :git => 'https://github.com/dagronf/DSFRegex/'
Add https://github.com/dagronf/DSFRegex to your project.
Copy the files in Sources/DSFRegex into your project.
For more examples and usage, see the tests in the Tests folder.
let phoneNumberRegex = try DSFRegex(#"(\d{4})-(\d{3})-(\d{3})"#)
let results = phoneNumberRegex.matches(for: "4499-999-999 3491-111-444 4324-222-123")
// results.numberOfMatches == 3
// results.text(match: 0) == "4499-999-999"
// results.text(match: 1) == "3491-111-444"
// results.text(match: 2) == "4324-222-123"
// Just retrieve the text for each of the matches
let textMatches = results.textMatching() // == ["4499-999-999", "3491-111-444", "4324-222-123"]
If you are only interested in the first match, use
let first = phoneNumberRegex.firstMatch(in: "4499-999-999 3491-111-444 4324-222-123")
let allMatches = phoneNumberRegex.matches(for: "0499-999-999 0491-111-444 4324-222-123")
for match in allMatches.matches.enumerated() {
    let matchText = allMatches.text(for: match.element)
    Swift.print("Match (\(match.offset)) -> `\(matchText)`")
    for capture in match.element.captures.enumerated() {
        let captureText = allMatches.text(for: capture.element)
        Swift.print("  Capture (\(capture.offset)) -> `\(captureText)`")
    }
}
Output:
Match (0) -> `0499-999-999`
  Capture (0) -> `0499`
  Capture (1) -> `999`
  Capture (2) -> `999`
Match (1) -> `0491-111-444`
  Capture (0) -> `0491`
  Capture (1) -> `111`
  Capture (2) -> `444`
Match (2) -> `4324-222-123`
  Capture (0) -> `4324`
  Capture (1) -> `222`
  Capture (2) -> `123`
/// Find all email addresses within a text
let emailRegex = try DSFRegex("… some regex …")
let inputString = "This is a test.\n [email protected] and [email protected], [email protected] lives here"
var count = 0
emailRegex.enumerateMatches(in: inputString) { (match) -> Bool in
    count += 1

    // Extract match information
    let matchRange = match.range
    let nsRange = NSRange(matchRange, in: inputString)
    let matchText = inputString[match.range]
    Swift.print("\(count) - Found '\(matchText)' at range \(nsRange)")

    // Stop processing once two matches have been found
    return count < 2
}
Output:
1 - Found '[email protected]' at range {17, 30}
2 - Found '[email protected]' at range {52, 21}
MIT License
Copyright (c) 2024 Darren Ford
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.