DSFRegex 3.4.0

Maintained by Darren Ford.



  • Author
  • Darren Ford

DSFRegex

A Swift regex class that abstracts away some of the complexities of NSRegularExpression.


Why?

Every time I have to use NSRegularExpression in Swift, I make the same mistakes over and over with ranges, and with range conversions between NSRange and Range<String.Index>.

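Extracting content using capture groups is also tedious and somewhat error-prone. I wanted to abstract away some of the things I kept getting wrong.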
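As a rough illustration of the conversion boilerplate described above, here is a minimal sketch that uses NSRegularExpression directly, without DSFRegex. The helper name firstNumber(in:) is made up for this example and is not part of any library.

```swift
import Foundation

// Hypothetical helper: returns the first run of digits in `text`, or nil.
func firstNumber(in text: String) -> String? {
    guard let regex = try? NSRegularExpression(pattern: #"(\d+)"#) else { return nil }

    // NSRegularExpression works in UTF-16 offsets, so the search range
    // has to be converted from Range<String.Index> to NSRange.
    let searchRange = NSRange(text.startIndex..<text.endIndex, in: text)

    guard let match = regex.firstMatch(in: text, options: [], range: searchRange),
          // ...and each result has to be converted back before subscripting the String.
          let captureRange = Range(match.range(at: 1), in: text) else { return nil }

    return String(text[captureRange])
}

print(firstNumber(in: "\u{1F1E6}\u{1F1F2} id: 42") ?? "no match")
```

Both conversions are failable or easy to forget, which is the kind of repeated mistake the wrapper is meant to hide.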

TL;DR - Show me something!

let inputText: String = <some text to match against>

// Build the regex to match against (in this case, <number>\t<string>)
// This regex has two capture groups, one for the number and one for the string.
let regex = try DSFRegex(#"(\d*)\t\"([^\"]+)\""#)

// Retrieve ALL the matches for the supplied text
let searchResult = regex.matches(for: inputText)

// Loop over each of the matches found, and print them out
searchResult.forEach { match in 
   let foundStr = inputText[match.range]          // The text of the entire match
   let numberVal = inputText[match.captures[0]]   // Retrieve the first capture group text.
   let stringVal = inputText[match.captures[1]]   // Retrieve the second capture group text.

   Swift.print("Number is \(numberVal), String is \(stringVal)")
}

The basic structure of the 'matches' result is as follows:

Matches
  > matches: An array of regex matches
    > range: A match range. This range specifies the match range within the original text being searched
    > captures: An array of capture groups
       > A capture range. This range represents the range of a capture within the original text being searched

Usage

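All ranges returned to the caller (and, conversely, all ranges passed in to the regex object) are expressed as ranges within the Swift String that was supplied for matching.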

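This matters because NSRegularExpression works with NSString, and code point and character range information differs between NSString and String. The difference is especially noticeable with characters in the higher Unicode ranges, such as emoji (🇦🇲 👨‍👩‍👦).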
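The mismatch can be seen with a small sketch that compares how String and NSString measure the two emoji mentioned above. It uses only Foundation; the variable names are illustrative.

```swift
import Foundation

// Build the emoji from Unicode scalars so the example does not depend on file encoding.
let flag = "\u{1F1E6}\u{1F1F2}"                             // Armenian flag
let family = "\u{1F468}\u{200D}\u{1F469}\u{200D}\u{1F466}"  // man + ZWJ + woman + ZWJ + boy

// Swift String counts user-perceived characters (grapheme clusters).
let flagCharacters = flag.count
let familyCharacters = family.count

// NSString counts UTF-16 code units.
let flagUnits = NSString(string: flag).length
let familyUnits = NSString(string: family).length

print("flag:   \(flagCharacters) Character, \(flagUnits) UTF-16 units")
print("family: \(familyCharacters) Character, \(familyUnits) UTF-16 units")
```

Each emoji is a single Character but spans 4 and 8 UTF-16 units respectively, so an NSRange location cannot be used directly as a character offset into a String.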

Creating

Create a regex object by passing a regex pattern to the constructor. If the pattern is malformed or cannot be compiled, the constructor throws an error.

// Match against dummy phone numbers XXXX-YYY-ZZZ
let phoneNumberRegex = try DSFRegex(#"(\d{4})-(\d{3})-(\d{3})"#)
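For readers unfamiliar with throwing initializers, the following sketch shows the do/catch pattern using the underlying NSRegularExpression. Whether DSFRegex surfaces exactly the same error value is an assumption, and compiles(_:) is a hypothetical helper.

```swift
import Foundation

// Hypothetical helper: reports whether a pattern can be compiled.
func compiles(_ pattern: String) -> Bool {
    do {
        _ = try NSRegularExpression(pattern: pattern)
        return true
    } catch {
        // A malformed pattern (for example an unbalanced parenthesis) ends up here.
        print("Invalid pattern '\(pattern)': \(error.localizedDescription)")
        return false
    }
}

print(compiles(#"(\d{4})-(\d{3})-(\d{3})"#))
print(compiles("(unclosed"))
```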

Matching

To check whether a string matches the regex, use the hasMatch method.

let hasAMatch = phoneNumberRegex.hasMatch("0499-999-999")   // true
let noMatch = phoneNumberRegex.hasMatch("0499 999 999")     // false

If you want to extract all of the match information, use the matches method.

let result = phoneNumberRegex.matches(for: "0499-999-999 0491-111-444 4324-222-123")
result.forEach { match in 
   let matchText = result.text(for: match)
   Swift.print("Match `\(matchText)`")
   for capture in match.captures {
      let captureText = result.text(for: capture)
      Swift.print(" - `\(captureText)`")
   }
}

Enumeration

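If you have a large input text, a complex regex that takes a while to process, or limited memory, you can enumerate the matches one at a time instead of processing everything up front.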

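The enumeration method lets you stop processing at any point (for example, if you have a time limit, or you are looking for one specific match within the text).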

/// Find all email addresses within a text
let inputString = "… some input string …"
let emailRegex = try DSFRegex("… some regex …")
emailRegex.enumerateMatches(in: inputString) { (match) -> Bool in

   // Extract match information
   let matchRange = match.range
   let matchText = inputString[match.range]
   Swift.print("Found '\(matchText)' at range \(matchRange)")
   
   // Continue processing
   return true
}
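For comparison, early termination with plain NSRegularExpression is done through a stop pointer rather than a Bool return value. The sketch below collects at most `limit` matches; firstMatches(of:in:limit:) is a hypothetical helper, not part of DSFRegex.

```swift
import Foundation

// Hypothetical helper: returns the text of the first `limit` matches.
func firstMatches(of pattern: String, in text: String, limit: Int) -> [String] {
    guard limit > 0, let regex = try? NSRegularExpression(pattern: pattern) else { return [] }

    var found: [String] = []
    let fullRange = NSRange(text.startIndex..<text.endIndex, in: text)

    regex.enumerateMatches(in: text, options: [], range: fullRange) { result, _, stop in
        guard let result = result,
              let range = Range(result.range, in: text) else { return }
        found.append(String(text[range]))

        // Setting the stop flag ends the enumeration before the rest of the text is scanned.
        if found.count >= limit {
            stop.pointee = true
        }
    }
    return found
}

let numbers = firstMatches(of: #"\d{4}-\d{3}-\d{3}"#,
                           in: "0499-999-999 0491-111-444 4324-222-123",
                           limit: 2)
print(numbers)
```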

String search cursor

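A string search cursor is useful when you search a string intermittently, for example each time the user clicks a "Next" button. The cursor keeps track of the current match and is used to find the next match in the string.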

var searchCursor: DSFRegex.Cursor?
var content: String

@IBAction func startSearch(_ sender: Any) {
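   // The constructor throws, so bail out if the pattern cannot be compiled
   guard let regex = try? DSFRegex(... some pattern ...) else { return }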
   
   // Find the first match in the string
   self.searchCursor = self.content.firstMatch(for: regex)
   
   self.displayForCurrentSearch()
}

@IBAction func nextSearchResult(_ sender: Any) {
   if let previous = self.searchCursor {
      // Find the next match in the string, continuing from the previous match
      self.searchCursor = self.content.nextMatch(for: previous)
   }
   self.displayForCurrentSearch()
}

internal func displayForCurrentSearch() {
   // Update the UI reflecting the search result found in self.searchCursor
   ...
}
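The idea behind an incremental cursor can be sketched with plain NSRegularExpression: remember the range of the last match and resume the search from its upper bound. This is only an illustration of the technique, not the actual DSFRegex.Cursor implementation; SearchCursor, findMatch and nextMatch are hypothetical names.

```swift
import Foundation

// Hypothetical stand-in for a search cursor: the regex plus the current match range.
struct SearchCursor {
    let regex: NSRegularExpression
    let range: Range<String.Index>
}

// Finds the first match at or after `start`.
func findMatch(of regex: NSRegularExpression, in text: String,
               from start: String.Index) -> SearchCursor? {
    let searchRange = NSRange(start..<text.endIndex, in: text)
    guard let match = regex.firstMatch(in: text, options: [], range: searchRange),
          let range = Range(match.range, in: text) else { return nil }
    return SearchCursor(regex: regex, range: range)
}

// Resumes the search where the previous match ended.
// Note: a pattern that can match the empty string would need extra handling here.
func nextMatch(after cursor: SearchCursor, in text: String) -> SearchCursor? {
    return findMatch(of: cursor.regex, in: text, from: cursor.range.upperBound)
}

let cursorText = "a1 b22 c333"
if let digits = try? NSRegularExpression(pattern: #"\d+"#) {
    var cursor = findMatch(of: digits, in: cursorText, from: cursorText.startIndex)
    while let current = cursor {
        print(cursorText[current.range])
        cursor = nextMatch(after: current, in: cursorText)
    }
}
```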

Replacing matches in a string

Returns a new string in which every regex match has been replaced by the template string.

// Redact email addresses within the text
let emailRegex = try DSFRegex("… some regex …")
let redacted = emailRegex.stringByReplacingMatches(
    in: inputString,
    withTemplate: NSRegularExpression.escapedTemplate(for: "<REDACTED-EMAIL-ADDRESS>")
)
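The role of escapedTemplate(for:) is easier to see next to an unescaped template. The sketch below uses NSRegularExpression directly; redact and maskPhoneNumbers are hypothetical helpers written for this example.

```swift
import Foundation

// Replaces every match with a literal string. Escaping ensures that "$" and "\"
// in the replacement are not interpreted as capture-group references.
func redact(_ pattern: String, in text: String, with replacement: String) -> String {
    guard let regex = try? NSRegularExpression(pattern: pattern) else { return text }
    let fullRange = NSRange(text.startIndex..<text.endIndex, in: text)
    let template = NSRegularExpression.escapedTemplate(for: replacement)
    return regex.stringByReplacingMatches(in: text, options: [],
                                          range: fullRange, withTemplate: template)
}

// Uses an unescaped template, so "$1" is substituted with the first capture group.
func maskPhoneNumbers(in text: String) -> String {
    guard let regex = try? NSRegularExpression(pattern: #"(\d{4})-\d{3}-\d{3}"#) else { return text }
    let fullRange = NSRange(text.startIndex..<text.endIndex, in: text)
    return regex.stringByReplacingMatches(in: text, options: [],
                                          range: fullRange, withTemplate: "$1-XXX-XXX")
}

print(redact(#"\d{4}-\d{3}-\d{3}"#, in: "Call 0499-999-999 now", with: "<$1>"))
print(maskPhoneNumbers(in: "Call 0499-999-999 now"))
```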

DSFRegex

The main class used to perform regex matching.

DSFRegex.Matches

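A class containing all the results of matching a regex against a text. It also provides methods to help extract text from match and/or capture objects.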

DSFRegex.Match

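A match object. Stores the range of the match within the original text. If capture groups are defined in the regex, it also contains an array of capture objects.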

DSFRegex.Capture

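A capture represents the range of a single capture group within a regex match. Each match may contain zero or more captures, depending on the capture groups defined in the regex.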

DSFRegex.Cursor

An incremental cursor object, used when searching via the String extension.

Integration

CocoaPods

pod 'DSFRegex', :git => 'https://github.com/dagronf/DSFRegex/'

Swift Package Manager

Add https://github.com/dagronf/DSFRegex to your project.

Direct

Copy the files in Sources/DSFRegex into your project.

Examples

For more examples and usage, see the tests in the Tests folder.

Phone number matching

let phoneNumberRegex = try DSFRegex(#"(\d{4})-(\d{3})-(\d{3})"#)
let results = phoneNumberRegex.matches(for: "4499-999-999 3491-111-444 4324-222-123")

// results.numberOfMatches == 3
// results.text(match: 0) == "4499-999-999"
// results.text(match: 1) == "3491-111-444"
// results.text(match: 2) == "4324-222-123"

// Just retrieve the text for each of the matches
let textMatches = results.textMatching()  // == ["4499-999-999", "3491-111-444", "4324-222-123"]

If you are only interested in the first match, use:

let first = phoneNumberRegex.firstMatch(in: "4499-999-999 3491-111-444 4324-222-123")

Data extraction

let allMatches = phoneNumberRegex.matches(for: "0499-999-888 0491-111-444 4324-222-123")
for match in allMatches.matches.enumerated() {
   let matchText = allMatches.text(for: match.element)
   Swift.print("Match (\(match.offset)) -> `\(matchText)`")
   for capture in match.element.captures.enumerated() {
      let captureText = allMatches.text(for: capture.element)
      Swift.print("  Capture (\(capture.offset)) -> `\(captureText)`")
   }
}

Output:

Match (0) -> `0499-999-888`
  Capture (0) -> `0499`
  Capture (1) -> `999`
  Capture (2) -> `888`
Match (1) -> `0491-111-444`
  Capture (0) -> `0491`
  Capture (1) -> `111`
  Capture (2) -> `444`
Match (2) -> `4324-222-123`
  Capture (0) -> `4324`
  Capture (1) -> `222`
  Capture (2) -> `123`

Print the first two email addresses in a text

/// Find all email addresses within a text
let emailRegex = try DSFRegex("… some regex …")
let inputString = "This is a test.\n [email protected] and [email protected], [email protected] lives here"

var count = 0
emailRegex.enumerateMatches(in: inputString) { (match) -> Bool in
   
   count += 1

   // Extract match information
   let matchRange = match.range
   let nsRange = NSRange(matchRange, in: inputString)
   let matchText = inputString[match.range]
   Swift.print("\(count) - Found '\(matchText)' at range \(nsRange)")

   // Stop processing once two matches have been found
   return count < 2
}

Output:

1 - Found '[email protected]' at range {17, 30}
2 - Found '[email protected]' at range {52, 21}

License

MIT License

Copyright (c) 2024 Darren Ford

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.