Python re: Regular Expressions in Detail

xiaoxiao, 2025-04-25

The figure below lists the regular-expression metacharacters and syntax that Python supports:

1.1 Greedy and non-greedy quantifiers

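Regular expressions are typically used to find matching strings in text. In Python, quantifiers are greedy by default (a few languages may default to non-greedy): they always try to match as many characters as possible. Non-greedy quantifiers do the opposite and try to match as few characters as possible. For example, searching "abbbc" with the regular expression "ab*" finds "abbb", whereas the non-greedy form "ab*?" finds only "a".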
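The difference is easy to check in an interpreter. The following is a minimal sketch using the "abbbc" example from the paragraph above. The HTML-like string in the second half is an assumption added for illustration.

```python
import re

text = "abbbc"

# Greedy: b* consumes as many b's as it can.
greedy = re.search(r"ab*", text).group()

# Non-greedy: b*? consumes as few b's as it can, which here is zero.
lazy = re.search(r"ab*?", text).group()

print(greedy)  # abbb
print(lazy)    # a

# A common place where the difference matters (sample string is illustrative).
markup = "<b>bold</b>"
greedy_tag = re.search(r"<.+>", markup).group()
lazy_tag = re.search(r"<.+?>", markup).group()

print(greedy_tag)  # <b>bold</b>
print(lazy_tag)    # <b>
```

Appending ? to a quantifier (*?, +?, ??, {m,n}?) is what switches it to non-greedy matching.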

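2. The re module

2.1 Getting started with re

Python supports regular expressions through the re module. The usual workflow is to compile the string form of a regular expression into a Pattern instance, use that Pattern to process text and obtain a result (a Match instance, or None when nothing matches), and then query the Match instance for information or carry out further operations.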

    import re

    # Compile the regular expression into a Pattern object
    pattern = re.compile(r'hello')

    # Match the text against the Pattern; returns None when there is no match
    match = pattern.match('hello world!')

    if match:
        # Use the Match object to get the matched text
        print(match.group())

    ### Output ###
    # hello
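The example above uses match(), which only succeeds when the pattern matches at the very start of the string. As a small sketch of the difference between match() and search(), and of the information a Match instance carries (the pattern "world" is chosen here purely for illustration):

```python
import re

pattern = re.compile(r"world")

# match() anchors at the start of the string, so this returns None.
at_start = pattern.match("hello world!")

# search() scans forward and returns the first match anywhere in the string.
anywhere = pattern.search("hello world!")

print(at_start)          # None
print(anywhere.group())  # world
print(anywhere.span())   # (6, 11)
```

Because both methods return None on failure, the result should be checked before calling group() or span() on it.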

Special operations

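1. The **?:** operation: choosing which substring to extract. When a pattern contains a capturing group (...), re.findall returns only the text captured by that group. Writing the group as (?:...) makes it non-capturing, so re.findall returns the whole match instead.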

    >>> re.findall(r'^.*(ing|ly|ed|ious|ies|ive|es|s|ment)$', 'processing')
    ['ing']
    >>> re.findall(r'^.*(?:ing|ly|ed|ious|ies|ive|es|s|ment)$', 'processing')
    ['processing']
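Grouping can also be used to split a word into stem and suffix. The sketch below reuses the suffix list from the example above. The second sample word, "processes", is an assumption added to show how a greedy .* interacts with the alternation. With two capturing groups, re.findall returns a list of tuples.

```python
import re

SUFFIXES = r"ing|ly|ed|ious|ies|ive|es|s|ment"

# Two capturing groups: findall returns (stem, suffix) tuples.
split_ing = re.findall(r"^(.*)(" + SUFFIXES + r")$", "processing")

# Greedy .* grabs as much as possible, leaving only "s" as the suffix.
greedy_split = re.findall(r"^(.*)(" + SUFFIXES + r")$", "processes")

# Non-greedy .*? stops early, so the longer suffix "es" is found.
lazy_split = re.findall(r"^(.*?)(" + SUFFIXES + r")$", "processes")

print(split_ing)     # [('process', 'ing')]
print(greedy_split)  # [('processe', 's')]
print(lazy_split)    # [('process', 'es')]
```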

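2. Searching across multiple words in a text: find every instance of "a ... man" in the text. The <...> token syntax used here is not part of the re module. It is the token-pattern form accepted by the findall method of an NLTK Text object, where each pair of angle brackets matches one token.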

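    >>> import nltk
    >>> # Tokens of "a monied man" and "a nervous man", wrapped in an nltk.Text
    >>> moby = nltk.Text("a monied man a nervous man".split())
    >>> moby.findall(r"<a>(<.*>)<man>")
    monied; nervous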

Searching for expressions of the form "x and other ys":

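    >>> import nltk
    >>> # words is assumed to be an nltk.Text; a tiny one is built here for illustration
    >>> words = nltk.Text("speed and other activities water and other liquids".split())
    >>> words.findall(r"<\w*><and><other><\w*s>")
    speed and other activities; water and other liquids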
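The angle-bracket pattern above works on a tokenized text. For a plain string, a roughly equivalent search can be sketched with the re module alone, using \b word boundaries in place of token delimiters. The sample sentence is an assumption made up for illustration.

```python
import re

sentence = "He likes speed and other activities, plus water and other liquids."

# \b\w+ matches one whole word; \w+s\b matches a word ending in "s".
phrases = re.findall(r"\b\w+ and other \w+s\b", sentence)

print(phrases)
# ['speed and other activities', 'water and other liquids']
```

This version assumes that words are separated by single spaces. A tokenizer-based search is more robust for real corpora.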

Lookahead and lookbehind assertions

https://blog.csdn.net/lilongsy/article/details/78505309

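| Assertion | Meaning | Syntax | Example |
| --- | --- | --- | --- |
| Positive lookahead (zero-width positive lookahead assertion) | Matches the position immediately before exp | (?=exp) | \b\w+(?=ing\b) applied to "I'm singing while you're dancing." matches sing and danc |
| Negative lookahead (zero-width negative lookahead assertion) | Matches a position that is not followed by exp | (?!exp) | \d{3}(?!\d) matches three digits that are not followed by another digit; \b((?!abc)\w)+\b matches words that do not contain the substring abc |
| Positive lookbehind (zero-width positive lookbehind assertion) | Matches the position immediately after exp | (?<=exp) | (?<=\bre)\w+\b applied to "reading a book" matches ading |
| Negative lookbehind (zero-width negative lookbehind assertion) | Matches a position that is not preceded by exp | (?<!exp) | (?<![a-z])\d{7} matches seven digits that are not preceded by a lowercase letter |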
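The example patterns listed above can be checked directly with re.findall. The sketch below is illustrative. The sample strings for the \d{3}(?!\d), abc, and seven-digit cases are assumptions made up for the demonstration. The abc pattern is written with a non-capturing group (?:...) so that findall returns whole words rather than the last captured character.

```python
import re

# Positive lookahead: word stems that are followed by "ing".
stems = re.findall(r"\b\w+(?=ing\b)", "I'm singing while you're dancing.")

# Negative lookahead: three digits not followed by another digit.
three_digits = re.findall(r"\d{3}(?!\d)", "1234 567")

# Negative lookahead applied per character: words without "abc" inside.
no_abc = re.findall(r"\b(?:(?!abc)\w)+\b", "xabcx hello abc world")

# Positive lookbehind: the rest of a word that starts with "re".
after_re = re.findall(r"(?<=\bre)\w+\b", "reading a book")

# Negative lookbehind: seven digits not preceded by a lowercase letter.
seven_digits = re.findall(r"(?<![a-z])\d{7}", "a1234567 b 7654321")

print(stems)         # ['sing', 'danc']
print(three_digits)  # ['234', '567']
print(no_abc)        # ['hello', 'world']
print(after_re)      # ['ading']
print(seven_digits)  # ['7654321']
```

Note that \d{3}(?!\d) matches the last three digits of a longer number ("234" in "1234"), because only the character after the match is constrained.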

Code example:

    import re

    text = "I play on playground. It is the best ground."

    # Positive lookahead: "play" only when it is followed by "ground"
    positivelookaheadobjpattern = re.findall(r'play(?=ground)', text, re.M | re.I)
    print("Positive lookahead: " + str(positivelookaheadobjpattern))
    # Positive lookahead: ['play']
    positivelookaheadobj = re.search(r'play(?=ground)', text, re.M | re.I)
    print("Positive lookahead character index: " + str(positivelookaheadobj.span()))
    # Positive lookahead character index: (10, 14)

    # Negative lookahead: "play" only when it is not followed by "ground"
    negativelookaheadobjpattern = re.findall(r'play(?!ground)', text, re.M | re.I)
    print("Negative lookahead: " + str(negativelookaheadobjpattern))
    # Negative lookahead: ['play']
    negativelookaheadobj = re.search(r'play(?!ground)', text, re.M | re.I)
    print("Negative lookahead character index: " + str(negativelookaheadobj.span()))
    # Negative lookahead character index: (2, 6)

    # Positive lookbehind: "ground" only when it is preceded by "play"
    positivelookbehindobjpattern = re.findall(r'(?<=play)ground', text, re.M | re.I)
    print("Positive lookbehind: " + str(positivelookbehindobjpattern))
    # Positive lookbehind: ['ground']
    positivelookbehindobj = re.search(r'(?<=play)ground', text, re.M | re.I)
    print("Positive lookbehind character index: " + str(positivelookbehindobj.span()))
    # Positive lookbehind character index: (14, 20)

    # Negative lookbehind: "ground" only when it is not preceded by "play"
    negativelookbehindobjpattern = re.findall(r'(?<!play)ground', text, re.M | re.I)
    print("Negative lookbehind: " + str(negativelookbehindobjpattern))
    # Negative lookbehind: ['ground']
    negativelookbehindobj = re.search(r'(?<!play)ground', text, re.M | re.I)
    print("Negative lookbehind character index: " + str(negativelookbehindobj.span()))
    # Negative lookbehind character index: (37, 43)