Alternatively, you could use the PyMuPDF library. Here is a short snippet that uses it to extract the text of each page.

Abhimanyu Tiwari

Venkat Raman

1 min read · Jul 2, 2019

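    import fitz  # PyMuPDF is imported under the name "fitz"

    def extractText(file):
        # Open the PDF and collect the text of every page.
        doc = fitz.open(file)
        text = []
        for page in doc:
            # get_text() is the current name; older releases spelled it getText().
            # Encoding yields one UTF-8 bytes object per page.
            t = page.get_text().encode("utf8")
            text.append(t)
        return text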
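The function above returns a list with one UTF-8 encoded bytes object per page. If you want a single string instead, something like the following sketch could work. The helper name join_pages and the separator default are my own assumptions for illustration; they are not part of PyMuPDF.

```python
from typing import Iterable, Union


def join_pages(pages: Iterable[Union[bytes, str]], separator: str = "\n\n") -> str:
    """Decode UTF-8 page chunks (if needed) and join them into one string.

    Hypothetical helper for illustration; not part of PyMuPDF.
    """
    decoded = []
    for chunk in pages:
        # The snippet above yields bytes; plain str is accepted as well.
        if isinstance(chunk, bytes):
            chunk = chunk.decode("utf8")
        decoded.append(chunk.strip())
    # Skip pages that contained no extractable text.
    return separator.join(part for part in decoded if part)
```

With a PDF on disk (the file name here is only a placeholder), usage would look like full_text = join_pages(extractText("report.pdf")).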

