Natural Language Processing with Python, Chapter 1

xiaoxiao2021-02-28  124

>>> from nltk.book import *
*** Introductory Examples for the NLTK Book ***
Loading text1, ..., text9 and sent1, ..., sent9
Type the name of the text or sentence to view it.
Type: 'texts()' or 'sents()' to list the materials.
text1: Moby Dick by Herman Melville 1851
text2: Sense and Sensibility by Jane Austen 1811
text3: The Book of Genesis
text4: Inaugural Address Corpus
text5: Chat Corpus
text6: Monty Python and the Holy Grail
text7: Wall Street Journal
text8: Personals Corpus
text9: The Man Who Was Thursday by G . K . Chesterton 1908

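Note: if the import raises an error, put the NLTK data (the nltk_data folder) in one of the directories listed in the error message.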

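concordance() looks up a word and shows each occurrence together with its surrounding context. It takes a single word as its argument.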
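
To make the idea concrete, here is a minimal pure-Python sketch of a keyword-in-context lookup. It is not NLTK's implementation: the function name, the `width` and `lines` parameters, the case-insensitive matching, and the toy token list are all assumptions made for illustration.

```python
def concordance(tokens, word, width=30, lines=25):
    """Return up to `lines` keyword-in-context strings for `word`.

    Hypothetical helper: matching is case-insensitive, and `width` is the
    number of characters of context kept on each side of the match.
    """
    target = word.lower()
    results = []
    for i, token in enumerate(tokens):
        if token.lower() != target:
            continue
        left = " ".join(tokens[:i])[-width:]       # text just before the match
        right = " ".join(tokens[i + 1:])[:width]   # text just after the match
        results.append(f"{left:>{width}} {token} {right}")
        if len(results) == lines:
            break
    return results


# Toy token list standing in for text1 (an assumption, not the real corpus).
tokens = "Call me Ishmael . Some call it Moby Dick . Moby Dick is a whale .".split()
for line in concordance(tokens, "Dick", width=20):
    print(line)
```

Right-aligning the left context keeps the matched word in the same column on every line, which is what makes the real concordance output easy to scan.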

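similar() finds the words that occur in the same contexts as its argument, which is a single word. In NLP, words that share contexts are called similar words.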
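
The notion of "same context" can be sketched in plain Python. The code below assumes a context is the pair (previous word, next word) and ranks other words by how many such pairs they share with the target. The function names, the scoring rule, and the toy sentence are assumptions for illustration, not NLTK's actual algorithm.

```python
from collections import Counter, defaultdict


def build_context_index(tokens):
    """Map each lower-cased word to a Counter of its (previous, next) contexts."""
    index = defaultdict(Counter)
    for prev, word, nxt in zip(tokens, tokens[1:], tokens[2:]):
        index[word.lower()][(prev.lower(), nxt.lower())] += 1
    return index


def similar(tokens, word, num=5):
    """Return up to `num` words that share at least one context with `word`."""
    index = build_context_index(tokens)
    target = word.lower()
    target_contexts = set(index[target])
    scores = Counter()
    for other, contexts in index.items():
        if other == target:
            continue
        shared = target_contexts & set(contexts)
        if shared:
            scores[other] = len(shared)  # score = number of shared contexts
    return [w for w, _ in scores.most_common(num)]


# Toy data (an assumption): "big", "white" and "old" all appear after "the".
tokens = ("the big whale swam . the white whale swam . "
          "the big ship sank . the old ship sank").split()
print(similar(tokens, "big"))
```

Here "white" and "old" come back for "big" because each of them also appears between "the" and a noun that "big" precedes.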

common_contexts() finds the contexts shared by several words. Its argument is a list of words.
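
As a rough illustration, shared contexts can be computed by intersecting the context sets of each word. This sketch again assumes that a context is the (previous word, next word) pair; the helper name and the toy tokens are hypothetical and only approximate what NLTK does.

```python
from collections import defaultdict


def common_contexts(tokens, words):
    """Return the sorted (previous, next) contexts shared by every word in `words`.

    Hypothetical helper; `words` must contain at least one word.
    """
    contexts = defaultdict(set)
    for prev, word, nxt in zip(tokens, tokens[1:], tokens[2:]):
        contexts[word.lower()].add((prev.lower(), nxt.lower()))
    shared = set.intersection(*(contexts[w.lower()] for w in words))
    return sorted(shared)


# Toy data (an assumption): both words occur between "a" and "good".
tokens = "a very good day . a pretty good day . a very pretty bird".split()
print(common_contexts(tokens, ["very", "pretty"]))
```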

generate() produces a random passage of text built from the words of the text.
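
How generate() chooses its words is not covered in these notes. Purely as an illustration of generating random text from a corpus, the sketch below uses a simple bigram chain: each next word is drawn from the words that followed the current word in the source. The function name, the `length` and `seed` parameters, and the bigram approach itself are assumptions, not a description of NLTK's method.

```python
import random
from collections import defaultdict


def generate(tokens, length=10, seed=0):
    """Return `length` tokens produced by a random walk over bigrams."""
    rng = random.Random(seed)  # fixed seed so repeated runs give the same text
    followers = defaultdict(list)
    for current, nxt in zip(tokens, tokens[1:]):
        followers[current].append(nxt)

    word = rng.choice(tokens)
    output = [word]
    while len(output) < length:
        candidates = followers.get(word)
        if candidates:
            word = rng.choice(candidates)
        else:
            # Dead end: this word only occurs as the last token, so restart.
            word = rng.choice(tokens)
        output.append(word)
    return output


# Toy token list (an assumption, not one of the NLTK texts).
tokens = "the whale swam and the ship sank and the crew sang".split()
print(" ".join(generate(tokens, length=8)))
```

Every generated word comes from the source tokens, so the output reuses the vocabulary of the text while the word order is random.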

>>> text1.concordance("Dick")
Displaying 25 of 84 matches:
Dick by Herman Melville 1851 ] ETYMOLOGY
must be the same that some call Moby Dick ." " Moby Dick ?" shouted Ahab . " D
e that some call Moby Dick ." " Moby Dick ?" shouted Ahab . " Do ye know the w
Death and devils ! men , it is Moby Dick ye have seen -- Moby Dick -- Moby Di
it is Moby Dick ye have seen -- Moby Dick -- Moby Dick !" " Captain Ahab ," sa
ck ye have seen -- Moby Dick -- Moby Dick !" " Captain Ahab ," said Starbuck ,
Captain Ahab , I have heard of Moby Dick -- but it was not Moby Dick that too
of Moby Dick -- but it was not Moby Dick that took off thy leg ?" " Who told
my hearties all round ; it was Moby Dick that dismasted me ; Moby Dick that b
s Moby Dick that dismasted me ; Moby Dick that brought me to this dead stump I
white whale ; a sharp lance for Moby Dick !" " God bless ye ," he seemed to ha
white whale ? art not game for Moby Dick ?" " I am game for his crooked jaw ,
l whaleboat ' s bow -- Death to Moby Dick ! God hunt us all , if we do not hun
hunt us all , if we do not hunt Moby Dick to his death !" The long , barbed st
owels to feel fear ! CHAPTER 41 Moby Dick . I , Ishmael , was one of that crew
ividualizing tidings concerning Moby Dick . It was hardly to be doubted , that
on must have been no other than Moby Dick . Yet as of late the Sperm Whale fis
ident ignorantly gave battle to Moby Dick ; such hunters , perhaps , for the m
g and piling their terrors upon Moby Dick ; those things had gone far to shake
ies , which eventually invested Moby Dick with new terrors unborrowed from any
rmen recalled , in reference to Moby Dick , the earlier days of the Sperm Whal
ngs were ready to give chase to Moby Dick ; and a still greater number who , c
was the unearthly conceit that Moby Dick was ubiquitous ; that he had actuall
their superstitions ; declaring Moby Dick not only ubiquitous , but immortal (
shaped lower jaw beneath him , Moby Dick had reaped away Ahab ' s leg , as a

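len(text) returns the length of the text in tokens, with punctuation marks counted as tokens (this is the size of the corpus).

set(text) returns the distinct tokens in the text (this is the vocabulary); len(set(text)) gives how many there are.

sorted(set(text)) sorts the vocabulary. Most punctuation comes first, then the words in alphabetical order, with capitalized words ahead of lowercase ones.

text.count("word") counts how many times word occurs in the text.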
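
These four operations work the same way on an ordinary Python list of tokens, so they can be tried without loading a corpus. The short token list below is made up for illustration and stands in for an NLTK text.

```python
# Toy token list (an assumption) in place of text1.
tokens = ['Call', 'me', 'Ishmael', '.', 'Call', 'me', 'again', '.']

total = len(tokens)            # every token counts, punctuation included
vocabulary = set(tokens)       # distinct tokens only
ordered = sorted(vocabulary)   # punctuation, then capitalized, then lowercase
calls = tokens.count('Call')   # occurrences of one exact token

print(total)            # 8
print(len(vocabulary))  # 5
print(ordered)          # ['.', 'Call', 'Ishmael', 'again', 'me']
print(calls)            # 2
```

Note that count() is case-sensitive: 'Call' and 'call' would be counted separately.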

Lists: 1. Concatenate several lists with +

2. Add an element to a list with append()

3. Indexing and slicing

>>> sent1
['Call', 'me', 'Ishmael', '.']
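
The three list operations can be tried on sent1 directly. In this sketch sent1 is typed in by hand so that it runs without NLTK; the second list and the appended word are made up for illustration.

```python
sent1 = ['Call', 'me', 'Ishmael', '.']
extra = ['Some', 'years', 'ago']   # hypothetical second list

# 1. Concatenation with + builds a new list and leaves both inputs unchanged.
combined = sent1 + extra

# 2. append() modifies a list in place, so work on a copy here.
words = list(sent1)
words.append('Again')

# 3. Indexing starts at 0; a slice [m:n] runs from m up to, but not including, n.
first = sent1[0]
middle = sent1[1:3]
last = sent1[-1]

print(combined)  # ['Call', 'me', 'Ishmael', '.', 'Some', 'years', 'ago']
print(words)     # ['Call', 'me', 'Ishmael', '.', 'Again']
print(first, middle, last)  # Call ['me', 'Ishmael'] .
```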

Converting between strings and lists: join() and split()

>>> sentence = ' '.join(['OH', 'MY', 'GOD'])
>>> print(sentence)
OH MY GOD
>>> split_sentence = sentence.split()
>>> print(split_sentence)
['OH', 'MY', 'GOD']
>>> saying = ['After', 'all', 'is', 'said', 'and', 'done', 'more', 'is', 'said', 'than', 'done']
>>> tokens = sorted(set(saying))[-2:]
>>> print(tokens)
['said', 'than']
>>> fdist1 = FreqDist(text1)
>>> print(fdist1)
<FreqDist with 19317 samples and 260819 outcomes>
>>> for key in fdist1:  # key order depends on the Python and NLTK version
...     print(key, fdist1[key])
...
funereal 1
unscientific 1
divinely 2
foul 11
four 74
gag 2
prefix 1
……
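
A frequency distribution is essentially a mapping from each token to the number of times it occurs. The sketch below uses `collections.Counter` from the standard library as a stand-in for FreqDist so that it runs without NLTK; the variable names and the sorting rule are assumptions for illustration.

```python
from collections import Counter

saying = ['After', 'all', 'is', 'said', 'and', 'done',
          'more', 'is', 'said', 'than', 'done']

fdist = Counter(saying)   # token -> number of occurrences
samples = len(fdist)      # distinct tokens
outcomes = sum(fdist.values())  # total tokens counted

# Sort by count (highest first), then alphabetically, for a stable listing.
ranked = sorted(fdist.items(), key=lambda item: (-item[1], item[0]))

print(samples, outcomes)  # 8 11
for key, count in ranked:
    print(key, count)
```

Sorting explicitly avoids depending on the iteration order of the mapping, which is why the listing here is reproducible.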