3 Answers
Contributor with 1995 reputation points, 2+ upvotes
The following regular expression works for all of the examples above:
public static void main(String[] args)
{
    for (String w : "camelValue".split("(?<!(^|[A-Z]))(?=[A-Z])|(?<!^)(?=[A-Z][a-z])")) {
        System.out.println(w);
    }
}
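It works by forcing the negative lookbehind to ignore not only matches at the start of the string, but also matches where an uppercase letter is preceded by another uppercase letter. This handles cases like "VALUE".
The first part of the regex on its own fails on "eclipseRCPExt" because it cannot split between "RCP" and "Ext". That is the purpose of the second clause: (?<!^)(?=[A-Z][a-z]). This clause allows a split before every uppercase letter that is followed by a lowercase letter, except at the start of the string.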
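To check this regex against the sample strings used in this thread, a minimal Java sketch along the following lines could be used. The class name `LookbehindSplitDemo` and the helper `split` are made up for illustration; only the pattern itself comes from the answer.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class; only the regex is taken from the answer above.
class LookbehindSplitDemo {
    // Alternative 1: split before an uppercase letter that is not preceded by
    //                the start of the string or by another uppercase letter.
    // Alternative 2: split before an uppercase letter followed by a lowercase
    //                letter, except at the start of the string.
    static final String SPLIT_REGEX = "(?<!(^|[A-Z]))(?=[A-Z])|(?<!^)(?=[A-Z][a-z])";

    static List<String> split(String s) {
        return Arrays.asList(s.split(SPLIT_REGEX));
    }

    public static void main(String[] args) {
        String[] samples = {"value", "camelValue", "TitleValue", "VALUE", "eclipseRCPExt"};
        for (String s : samples) {
            System.out.println(s + " -> " + String.join(" / ", split(s)));
        }
    }
}
```

Under these assumptions, the last line printed should be `eclipseRCPExt -> eclipse / RCP / Ext`, while `VALUE` is left in one piece.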
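Contributor with 1804 reputation points, 3+ upvotes
It seems you are making this more complicated than it needs to be. For camelCase, the split location is simply anywhere an uppercase letter immediately follows a lowercase letter: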
(?<=[a-z])(?=[A-Z])
Here is how this regex splits the example data:
value -> value
camelValue -> camel / Value
TitleValue -> Title / Value
VALUE -> VALUE
eclipseRCPExt -> eclipse / RCPExt
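The only difference from the desired output is with eclipseRCPExt, which I would argue is split correctly here.
Addendum: improved version
Note: this answer recently got an upvote, and I realized that there is a better way...
By adding a second alternative to the regex above, all of the OP's test cases are split correctly.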
(?<=[a-z])(?=[A-Z])|(?<=[A-Z])(?=[A-Z][a-z])
Here is how the improved regex splits the example data:
value -> value
camelValue -> camel / Value
TitleValue -> Title / Value
VALUE -> VALUE
eclipseRCPExt -> eclipse / RCP / Ext
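The two tables can be reproduced with a short Java sketch that runs both patterns over the same inputs. The class name `TwoAlternativeSplitDemo` and its helper are hypothetical; the two regexes are the ones quoted in this answer.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class comparing the simple and the improved pattern.
class TwoAlternativeSplitDemo {
    // Split wherever a lowercase letter is directly followed by an uppercase letter.
    static final String SIMPLE = "(?<=[a-z])(?=[A-Z])";
    // Additionally split inside an uppercase run, just before its last letter,
    // when that letter starts a Title-case word (uppercase followed by lowercase).
    static final String IMPROVED = "(?<=[a-z])(?=[A-Z])|(?<=[A-Z])(?=[A-Z][a-z])";

    static List<String> split(String s, String regex) {
        return Arrays.asList(s.split(regex));
    }

    public static void main(String[] args) {
        String[] samples = {"value", "camelValue", "TitleValue", "VALUE", "eclipseRCPExt"};
        for (String s : samples) {
            System.out.println(s + " -> " + String.join(" / ", split(s, SIMPLE))
                    + "   |   " + String.join(" / ", split(s, IMPROVED)));
        }
    }
}
```

Only `eclipseRCPExt` should differ between the two columns: `eclipse / RCPExt` with the simple pattern and `eclipse / RCP / Ext` with the improved one.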
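Contributor with 1827 reputation points, 8+ upvotes
I could not get aix's solution to work (and it does not work on RegExr either), so I came up with my own, which I have tested and which seems to do exactly what you are looking for: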
((^[a-z]+)|([A-Z]{1}[a-z]+)|([A-Z]+(?=([A-Z][a-z])|($))))
Here is an example of using it:
; Regex Breakdown: This will match against each word in Camel and Pascal case strings, while properly handling acronyms.
; (^[a-z]+) Match against any lower-case letters at the start of the string.
; ([A-Z]{1}[a-z]+) Match against Title case words (one upper case followed by lower case letters).
; ([A-Z]+(?=([A-Z][a-z])|($))) Match against multiple consecutive upper-case letters, leaving the last upper case letter out of the match if it is followed by lower case letters, and including it if it's followed by the end of the string.
newString := RegExReplace(oldCamelOrPascalString, "((^[a-z]+)|([A-Z]{1}[a-z]+)|([A-Z]+(?=([A-Z][a-z])|($))))", "$1 ")
newString := Trim(newString)
Here I separate each word with a space, so here are some examples of how strings are transformed:
ThisIsATitleCASEString => This Is A Title CASE String
andThisOneIsCamelCASE => and This One Is Camel CASE
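For readers who are not using AutoHotkey, the same replace-then-trim idea can be sketched in Java. This is an assumed port rather than part of the original answer: the class `AcronymSpacingDemo` and the method `spaceWords` are illustrative names, and only the pattern and the `$1 ` replacement are taken from the snippet above.

```java
import java.util.regex.Pattern;

// Hypothetical Java port of the RegExReplace + Trim snippet above.
class AcronymSpacingDemo {
    // Same pattern as in the answer: a leading lowercase word, a Title-case
    // word, or an uppercase run that stops before a following Title-case word.
    static final Pattern WORD = Pattern.compile(
            "((^[a-z]+)|([A-Z]{1}[a-z]+)|([A-Z]+(?=([A-Z][a-z])|($))))");

    static String spaceWords(String s) {
        // Append a space after every matched word, then drop the trailing space.
        return WORD.matcher(s).replaceAll("$1 ").trim();
    }

    public static void main(String[] args) {
        System.out.println(spaceWords("ThisIsATitleCASEString"));
        System.out.println(spaceWords("andThisOneIsCamelCASE"));
    }
}
```

The lookahead is what keeps `CASE` and `String` apart: the greedy `[A-Z]+` first grabs `CASES`, then backtracks by one letter so that the following `St` can satisfy `[A-Z][a-z]`.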
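The solution above does what the original post asks for, but I also needed a regex that handles camel and Pascal case strings containing numbers, so I came up with this variation that includes numbers: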
((^[a-z]+)|([0-9]+)|([A-Z]{1}[a-z]+)|([A-Z]+(?=([A-Z][a-z])|($)|([0-9]))))
And an example of using it:
; Regex Breakdown: This will match against each word in Camel and Pascal case strings, while properly handling acronyms and including numbers.
; (^[a-z]+) Match against any lower-case letters at the start of the string.
; ([0-9]+) Match against one or more consecutive numbers (anywhere in the string, including at the start).
; ([A-Z]{1}[a-z]+) Match against Title case words (one upper case followed by lower case letters).
; ([A-Z]+(?=([A-Z][a-z])|($)|([0-9]))) Match against multiple consecutive upper-case letters, leaving the last upper case letter out of the match if it is followed by lower case letters, and including it if it's followed by the end of the string or a number.
newString := RegExReplace(oldCamelOrPascalString, "((^[a-z]+)|([0-9]+)|([A-Z]{1}[a-z]+)|([A-Z]+(?=([A-Z][a-z])|($)|([0-9]))))", "$1 ")
newString := Trim(newString)
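Here are some examples of how strings containing numbers are transformed with this regex:
myVariable123 => my Variable 123
my2Variables => my 2 Variables
3rdVariableIsHere => 3 rdVariable Is Here
12345NumsAtTheStartIncludedToo => 12345 Nums At The Start Included Too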
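The number-aware variant can be exercised the same way in Java. As before, this is only a sketch under assumptions: `NumberSpacingDemo` and `spaceWords` are hypothetical names, and the code relies on Java's `replaceAll` accepting the same `$1` group reference as the AutoHotkey call.

```java
import java.util.regex.Pattern;

// Hypothetical Java port of the number-aware RegExReplace + Trim snippet.
class NumberSpacingDemo {
    // Pattern from the answer: adds a digit-run alternative and lets an
    // uppercase run end before a digit as well.
    static final Pattern WORD = Pattern.compile(
            "((^[a-z]+)|([0-9]+)|([A-Z]{1}[a-z]+)|([A-Z]+(?=([A-Z][a-z])|($)|([0-9]))))");

    static String spaceWords(String s) {
        // Characters that match no alternative are copied through unchanged.
        return WORD.matcher(s).replaceAll("$1 ").trim();
    }

    public static void main(String[] args) {
        String[] samples = {
            "myVariable123", "my2Variables", "3rdVariableIsHere",
            "12345NumsAtTheStartIncludedToo"
        };
        for (String s : samples) {
            System.out.println(s + " => " + spaceWords(s));
        }
    }
}
```

In `3rdVariableIsHere` the letters `rd` match none of the alternatives, because the lowercase branch is anchored to the start of the string. The replacement therefore leaves them in place, and they stay attached to `Variable`.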