willsonlincake posted on 2022-4-14 19:45:23

Neattext word count

>>> import neattext as nt
>>> mytext = "This is the mail example@gmail.com ,our WEBSITE is https://example.com 😊."
>>> docx = nt.TextFrame(text=mytext)
>>> docx.text
"This is the mail example@gmail.com ,our WEBSITE is https://example.com 😊."
>>>
>>> docx.describe()
Key      Value         
Length: 73            
vowels: 21            
consonants: 34            
stopwords: 4            
punctuations: 8            
special_char: 8            
tokens(whitespace): 10            
tokens(words): 14            
>>>
>>> docx.length
73
>>> # Scan the percentage of noise (unclean data) in the text
>>> docx.noise_scan()
{'text_noise': 19.17808219178082, 'text_length': 73, 'noise_count': 14}
>>>
>>> docx.head(16)
'This is the mail'
>>> docx.tail()
>>> docx.count_vowels()
>>> docx.count_stopwords()
>>> docx.count_consonants()
>>> docx.nlongest()
>>> docx.nshortest()
>>> docx.readability()
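
The numbers in the session can be cross-checked without neattext. The sketch below assumes that text_noise is simply noise_count divided by text_length, expressed as a percentage; that matches the figures shown above but is not taken from the library's source. The helper name noise_percentage is hypothetical and is not part of the neattext API.

```python
# Plain-Python sanity check of the figures shown in the session above.
# Assumption: text_noise = noise_count / text_length * 100.
# noise_percentage is a hypothetical helper, not a neattext function.

def noise_percentage(noise_count: int, text_length: int) -> float:
    """Return the share of noisy characters as a percentage."""
    if text_length == 0:
        return 0.0
    return noise_count / text_length * 100


# Same sample text as in the post; \U0001F60A is the smiling-face emoji.
mytext = (
    "This is the mail example@gmail.com ,our WEBSITE is "
    "https://example.com \U0001F60A."
)

print(len(mytext))               # character count, comparable to docx.length
print(mytext[:16])               # first 16 characters, comparable to docx.head(16)
print(noise_percentage(14, 73))  # values taken from the noise_scan() output
```

With the emoji counted as a single character, the string is 73 characters long, which agrees with the Length value reported by describe().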