Offline Text Recognition with Python

Computer Technology | 1075 views | 1 reply | 2022-03-28

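Python has many libraries for text recognition (OCR), but quite a few of them need an internet connection, and some can only be reached through a proxy. For text recognition that has to run offline, pytesseract addresses exactly this pain point.
To use pytesseract, first install Tesseract. The installer comes in 32-bit and 64-bit versions, which can be downloaded from the links below.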
32-bit: https://digi.bib.uni-mannheim.de/tesseract/tesseract-ocr-w32-setup-v5.0.1.20220118.exe
64-bit: https://digi.bib.uni-mannheim.de/tesseract/tesseract-ocr-w64-setup-v5.0.1.20220118.exe
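Run the installer once the download finishes. The installer also offers a Simplified Chinese language pack as an optional download, and the pack can also be fetched online separately.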
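Once a language pack is installed, it can be selected through the `lang` argument of `pytesseract.image_to_string`. The following is a minimal sketch, assuming the Simplified Chinese pack is installed under Tesseract's usual code `chi_sim`; the helpers `build_lang` and `recognize` are hypothetical names used only for illustration.

```python
def build_lang(*codes):
    """Join Tesseract language codes into the 'a+b' form used by the lang argument."""
    # Fall back to English when no code is given.
    return "+".join(codes) if codes else "eng"


def recognize(image_path, *codes):
    """Run OCR on image_path using the given language codes."""
    # Imported lazily so build_lang can be used without the OCR stack installed.
    from PIL import Image
    import pytesseract

    with Image.open(image_path) as img:
        return pytesseract.image_to_string(img, lang=build_lang(*codes))
```

For a page that mixes Chinese and English, a call such as `recognize(r'E:\sim.jpg', 'chi_sim', 'eng')` passes `chi_sim+eng` to Tesseract.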
After installation, open the installation directory and add its path to the system PATH variable.
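If editing PATH is inconvenient, pytesseract can instead be pointed at the executable through `pytesseract.pytesseract.tesseract_cmd`. A sketch under assumptions: the path in `DEFAULT_EXE` is only a guess at a typical install location and should be changed to match your machine, and `find_tesseract` and `configure_pytesseract` are hypothetical helper names.

```python
import os
import shutil

# Assumed install location; adjust it to wherever Tesseract was installed.
DEFAULT_EXE = r"C:\Program Files\Tesseract-OCR\tesseract.exe"


def find_tesseract(fallback=DEFAULT_EXE):
    """Return the path of the tesseract executable, or None if it cannot be found."""
    on_path = shutil.which("tesseract")  # succeeds when PATH was set up correctly
    if on_path:
        return on_path
    return fallback if os.path.isfile(fallback) else None


def configure_pytesseract():
    """Tell pytesseract which executable to call and return that path."""
    import pytesseract  # imported lazily; needs the pytesseract package

    exe = find_tesseract()
    if exe is None:
        raise FileNotFoundError("tesseract executable not found; check PATH")
    pytesseract.pytesseract.tesseract_cmd = exe
    return exe
```

Calling `find_tesseract()` on its own is also a quick way to confirm that the PATH change took effect: it returns `None` when the executable is not visible.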
Next, install the two libraries pytesseract and Pillow with pip (for example, pip install pytesseract Pillow).
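To confirm that both libraries ended up in the interpreter you are actually running, a small check like the one below may help. It is only a sketch; `missing_packages` is a hypothetical helper. Note that Pillow is imported under the name `PIL`.

```python
import importlib.util


def missing_packages(modules=("pytesseract", "PIL")):
    """Return the names in `modules` that cannot be imported by this interpreter."""
    return [name for name in modules if importlib.util.find_spec(name) is None]
```

An empty list from `missing_packages()` means both imports should succeed; any name that is returned still needs to be installed.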
Once they are installed, we can write a short program to check what the library can do, using the image below as an example.
sim.jpg

Create a new Python file and enter the following code:
  from PIL import Image
  import pytesseract

  # Raw string, so the backslash in the Windows path is not read as an escape.
  print(pytesseract.image_to_string(Image.open(r'E:\sim.jpg')))
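The one-liner fails with a fairly opaque traceback when the image is missing or Tesseract cannot be found. A slightly more defensive variant might look like this; it is a sketch rather than part of the original post, and `ocr_file` is a hypothetical name.

```python
from pathlib import Path


def ocr_file(image_path):
    """Return the text recognized in image_path, raising clearer errors on failure."""
    path = Path(image_path)
    if not path.is_file():
        raise FileNotFoundError(f"image not found: {path}")

    # Imported here so the path check above works even without the OCR stack.
    from PIL import Image
    import pytesseract

    try:
        with Image.open(path) as img:
            return pytesseract.image_to_string(img)
    except pytesseract.pytesseract.TesseractNotFoundError as exc:
        raise RuntimeError("Tesseract is not installed or not on PATH") from exc
```

With this in place, `print(ocr_file(r'E:\sim.jpg'))` behaves like the original program when everything is set up correctly.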
Save and run the program to see the result:
  ‘Typical Cover Letter Format

  Your Address
  ‘Your Contact Information

  Date

  Contact Name (i available)
  Contact Title

  ‘Compary Name
  ‘Company Address

  Dear Mr/Ms,/Dr.(f no contact, you can say “Human Resources Manager, or Hiring Manager"),

  ‘The first paragraph is an introduction of yourself and how you learned of the opening, as well as your
  interest in the postionforganization. This requires you to relate yourself to the organization or tothe
  postion in order to demonstrate your interest.

  The middle paragraph(s) is a profile of how your skils and experience match the qualifications
  ‘sought. In order to do this, consider the following points:

  Read the job description carefully to get a clear idea of what the company is looking for. This goes
  beyond just the “qualifications” section of a job description- make sure to discuss your abilty to do
  the job.

  Review the company website to learn what type of person the company might value.

  Match your background, whether itis work experience, academics, volunteer experience, etc. and
  describe why you believe these experiences make you a qualified candidate for the positon,

  ‘The last paragraph wraps up the cover letter. You should reiterate your interest in the pasion, and
  <desire to hear from them regarding the opportunity. You also want to thank the reader for ther time in
  ‘considering your application, and provide information for how you can be reached. if you would ike,
  and are able to, you can state that you wil follow-up with them directly. Be positive and confident
  (without being arrogant).

  Sincerely,

  ‘Signed Signature ({f a physical copy is being sent)

  Name (Typed)
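The recognized text contains a few stray characters, such as a leading ‘ or < on some lines. A small post-processing pass can tidy these up. The rules below are illustrative assumptions chosen for this particular output, not something pytesseract provides, and `tidy_ocr_text` is a hypothetical helper.

```python
import re


def tidy_ocr_text(text):
    """Strip stray leading quote or '<' marks and collapse runs of blank lines."""
    cleaned = []
    for line in text.splitlines():
        # Remove stray marks only when they sit directly in front of a word.
        cleaned.append(re.sub(r"^[‘'<]+(?=\w)", "", line.strip()))
    # Collapse two or more consecutive blank lines into a single blank line.
    return re.sub(r"\n{3,}", "\n\n", "\n".join(cleaned)).strip()
```

This does not fix misread words such as "postion" or "skils"; those would need a spell checker or a cleaner source image.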
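We could also combine this library with Tkinter to build a GUI text recognition program. Later on, I plan to develop a Scribus text recognition plugin based on this library.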
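As a rough idea of what such a Tkinter front end might look like, here is a minimal sketch. It is not the author's program: the window layout, the file-type filter, and the names `run_ocr`, `recognize_or_message`, and `build_app` are all assumptions made for illustration.

```python
def run_ocr(image_path):
    """Recognize the text in an image file with pytesseract."""
    from PIL import Image
    import pytesseract

    with Image.open(image_path) as img:
        return pytesseract.image_to_string(img)


def recognize_or_message(path, ocr=run_ocr):
    """Return the OCR text, or a readable message if recognition fails."""
    try:
        return ocr(path)
    except Exception as exc:
        return f"OCR failed: {exc}"


def build_app(ocr=run_ocr):
    """Build a window with an 'open image' button and a text area for the result."""
    import tkinter as tk
    from tkinter import filedialog, scrolledtext

    root = tk.Tk()
    root.title("Offline OCR")
    output = scrolledtext.ScrolledText(root, width=80, height=25)

    def choose_file():
        path = filedialog.askopenfilename(
            filetypes=[("Images", "*.jpg *.jpeg *.png *.bmp *.tif *.tiff")]
        )
        if not path:
            return  # the dialog was cancelled
        output.delete("1.0", tk.END)
        output.insert(tk.END, recognize_or_message(path, ocr))

    tk.Button(root, text="Open image...", command=choose_file).pack(pady=5)
    output.pack(fill=tk.BOTH, expand=True)
    return root
```

Calling `build_app().mainloop()` opens the window. Passing the OCR function in as a parameter keeps the GUI code separate from pytesseract, which makes it easier to swap in a different engine or a stub later.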


Replies | 1 in total

释清心 posted on 2022-3-30 15:08:03

good job