Intelligently Extract Text & Data from Document with OCR NER
Intelligently Extract Text & Data from Document with OCR NER is available at $69.99. It has 92 lectures, an average rating of 4.65 based on 374 reviews, and 3408 subscribers.
You will learn to develop and train a Named Entity Recognition (NER) model, extract both text and entities from business card images, build a business card scanner like ABBYY from scratch, apply high-level data preprocessing techniques to natural language problems, and build real-time NER apps. The course is aimed at anyone who wants to develop a business card reader app, and at data scientists, analysts, and Python developers who want to enhance their NLP skills.
Summary
Title: Intelligently Extract Text & Data from Document with OCR NER
Price: $69.99
Average Rating: 4.65
Number of Lectures: 92
Number of Published Lectures: 89
Number of Curriculum Items: 92
Number of Published Curriculum Objects: 89
Original Price: $129.99
Quality Status: approved
Status: Live
What You Will Learn
- Develop and train a Named Entity Recognition (NER) model
- Extract not only text from an image but also entities from a business card
- Develop a business card scanner like ABBYY from scratch
- High-level data preprocessing techniques for natural language problems
- Real-time NER apps
Who Should Attend
- Anyone who wants to develop a business card reader app
- Data scientists, analysts, and Python developers who want to enhance their skills in NLP
Welcome to the course “Intelligently Extract Text & Data from Document with OCR NER”!
In this course you will learn how to develop a customized Named Entity Recognizer. The main idea is to extract entities from scanned documents such as invoices, business cards, shipping bills, and bills of lading. For the sake of data privacy, we restrict ourselves to business cards, but you can apply the framework explained here to all kinds of financial documents. The curriculum we follow to develop the project is given below.
To develop this project we will use two main technologies in data science:
- Computer Vision
- Natural Language Processing
In the Computer Vision module, we will scan the document, identify the location of the text, and finally extract the text from the image. Then, in the Natural Language Processing module, we will extract the entities from the text, do the necessary text cleaning, and parse the entities from the text.
Python libraries used in the Computer Vision module:
- OpenCV
- NumPy
- Pytesseract
Python libraries used in the Natural Language Processing module:
- spaCy
- Pandas
- Regular Expression
- String
As we are combining two major technologies to develop the project, we divide the course into several stages of development to make it easier to follow.
Stage 1: We will set up the project by doing the necessary installations.
- Install Python
- Install dependencies
Stage 2: We will do data preparation. That is, we will extract text from images using Pytesseract and do the necessary cleaning.
- Gather images
- Overview of Pytesseract
- Extract text from all images
- Clean and prepare text
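As a rough illustration of the extraction and cleaning steps in this stage, the sketch below parses tab-separated output of the kind `pytesseract.image_to_data` returns by default and strips stray whitespace and punctuation from each word. The sample rows are made up, and `clean_token`, `parse_ocr_tsv`, and `STRIP_CHARS` are hypothetical names chosen for this example, not the course's own code.

```python
import csv
import io
import string

# Made-up sample in the tab-separated layout that pytesseract.image_to_data
# returns by default (conf is -1 for rows that are not individual words).
SAMPLE_TSV = (
    "level\tpage_num\tblock_num\tpar_num\tline_num\tword_num\t"
    "left\ttop\twidth\theight\tconf\ttext\n"
    "4\t1\t1\t1\t1\t0\t30\t40\t300\t32\t-1\t\n"
    "5\t1\t1\t1\t1\t1\t30\t40\t120\t32\t96\tJohn\n"
    "5\t1\t1\t1\t1\t2\t160\t40\t140\t32\t95\tSmith,\n"
    "5\t1\t1\t1\t2\t1\t30\t90\t20\t30\t41\t|\n"
    "5\t1\t1\t1\t2\t2\t60\t90\t260\t30\t93\t  john@example.com \n"
)

# Assumption: characters worth trimming from the edges of an OCR token.
STRIP_CHARS = string.whitespace + ",;:|!?\"'()[]{}"


def clean_token(text):
    """Trim whitespace and stray punctuation from both ends of a token."""
    return text.strip(STRIP_CHARS)


def parse_ocr_tsv(tsv_text):
    """Return one dict per usable word: cleaned text plus its bounding box."""
    reader = csv.DictReader(
        io.StringIO(tsv_text), delimiter="\t", quoting=csv.QUOTE_NONE
    )
    words = []
    for row in reader:
        token = clean_token(row["text"] or "")
        # Skip structural rows (conf < 0) and tokens that clean to nothing.
        if float(row["conf"]) < 0 or not token:
            continue
        words.append({
            "line": int(row["line_num"]),
            "left": int(row["left"]),
            "top": int(row["top"]),
            "width": int(row["width"]),
            "height": int(row["height"]),
            "text": token,
        })
    return words


words = parse_ocr_tsv(SAMPLE_TSV)
print([w["text"] for w in words])
```

The bounding-box fields are kept alongside each word so that later stages can draw boxes on the image for the predicted entities.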
Stage 3: We will see how to label NER data using BIO tagging.
- Manual labeling with the BIO technique
- B – Beginning
- I – Inside
- O – Outside
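To make the BIO scheme concrete, here is a minimal sketch that turns per-token entity labels into BIO tags. The label names (`NAME`, `DES`, `ORG`, `PHONE`) and the function `bio_tags` are assumptions made for this example; the course labels its data manually and may use different names.

```python
def bio_tags(labels):
    """Convert per-token labels (None = no entity) into BIO tags.

    The first token of an entity gets B-<label>, following tokens of the
    same entity get I-<label>, and everything else gets O.
    Limitation: two adjacent entities with the same label are merged.
    """
    tags = []
    previous = None
    for label in labels:
        if label is None:
            tags.append("O")
        elif label == previous:
            tags.append("I-" + label)
        else:
            tags.append("B-" + label)
        previous = label
    return tags


tokens = ["John", "Smith", "Senior", "Engineer", "Acme", "call", "555-0100"]
labels = ["NAME", "NAME", "DES", "DES", "ORG", None, "PHONE"]

for token, tag in zip(tokens, bio_tags(labels)):
    print(f"{token}\t{tag}")
```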
Stage 4: We will further clean the text and preprocess the data to train the machine learning model.
- Prepare training data for spaCy
- Convert data into spaCy format
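spaCy's NER training data marks entities by character offsets in the text rather than by per-token tags, so BIO-tagged tokens have to be converted. The sketch below shows one way to do that, producing a `(text, {"entities": [(start, end, label)]})` pair. The function name `bio_to_spans`, the label names, and the choice of joining tokens with single spaces are assumptions for illustration; the exact format the course uses may differ.

```python
def bio_to_spans(tokens, tags):
    """Join tokens with spaces and return (text, {"entities": [...]}).

    Each entity is (start_char, end_char, label), with end exclusive.
    A stray I- tag that does not continue the current entity is treated
    as outside any entity.
    """
    text = " ".join(tokens)
    entities = []
    current = None  # [start, end, label] of the entity being built
    position = 0
    for token, tag in zip(tokens, tags):
        start, end = position, position + len(token)
        position = end + 1  # account for the joining space
        if tag.startswith("B-"):
            if current:
                entities.append(tuple(current))
            current = [start, end, tag[2:]]
        elif tag.startswith("I-") and current and current[2] == tag[2:]:
            current[1] = end  # extend the running entity
        else:
            if current:
                entities.append(tuple(current))
            current = None
    if current:
        entities.append(tuple(current))
    return text, {"entities": entities}


tokens = ["John", "Smith", "Acme", "555-0100"]
tags = ["B-NAME", "I-NAME", "B-ORG", "B-PHONE"]
text, annotations = bio_to_spans(tokens, tags)
print(text)
print(annotations)
```

A quick sanity check on converted data is to slice the text with each span, for example `text[0:10]`, and confirm it matches the intended entity.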
Stage 5: With the preprocessed data we will train the Named Entity Recognition model.
- Configure the NER model
- Train the model
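Training needs a held-out set to evaluate the model on, so the labeled examples are usually split before training starts. The sketch below is one simple way to do that with a seeded shuffle. The function name `split_examples`, the 80/20 ratio, and the seed are assumptions for illustration, not values taken from the course.

```python
import random


def split_examples(examples, dev_fraction=0.2, seed=42):
    """Shuffle examples reproducibly and return (train, dev) lists.

    At least one example always goes to the dev set, so very small
    datasets may end up with few or no training examples.
    """
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    n_dev = max(1, round(len(shuffled) * dev_fraction))
    return shuffled[n_dev:], shuffled[:n_dev]


# Stand-ins for annotated business cards; real items would be labeled text.
examples = list(range(10))
train, dev = split_examples(examples)
print(len(train), len(dev))
```

Fixing the seed keeps the split stable between runs, which makes it easier to compare two versions of the model on the same evaluation data.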
Stage 6: We will predict the entities using the NER model and create a data pipeline for parsing text.
- Load the model
- Render and serve with displaCy
- Draw bounding boxes on the image
- Parse entities from text
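The parsing step can be pictured as grouping consecutive B- and I- tokens into whole entities and then normalizing each value. The sketch below shows that idea on hand-written tags rather than real model output. The names `parse_entities` and `clean_phone`, the label names, and the digits-only phone rule are assumptions for this example.

```python
import re
from collections import defaultdict


def parse_entities(tokens, tags):
    """Group BIO-tagged tokens into {label: [entity text, ...]}."""
    grouped = defaultdict(list)
    for token, tag in zip(tokens, tags):
        if tag == "O":
            continue
        prefix, label = tag.split("-", 1)
        if prefix == "B" or not grouped[label]:
            grouped[label].append(token)  # start a new entity
        else:
            grouped[label][-1] += " " + token  # continue the last one
    return dict(grouped)


def clean_phone(value):
    """Keep only the digits of a phone number (assumed cleaning rule)."""
    return re.sub(r"\D", "", value)


tokens = ["John", "Smith", "Acme", "Corp", "+1", "(555)", "010-0199"]
tags = ["B-NAME", "I-NAME", "B-ORG", "I-ORG", "B-PHONE", "I-PHONE", "I-PHONE"]

entities = parse_entities(tokens, tags)
entities["PHONE"] = [clean_phone(p) for p in entities["PHONE"]]
print(entities)
```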
Finally, we will put it all together and create a document scanner app.
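The document scanner chapters in the curriculum include finding the four corner points of the document and applying a perspective transform to crop it. A common preparatory step, sketched below, is to put the four detected corners into a fixed order before handing them to a function such as OpenCV's `cv2.getPerspectiveTransform`. The function `order_corners` and the sample points are hypothetical, and the course's own approach may differ.

```python
def order_corners(points):
    """Order four (x, y) points as top-left, top-right, bottom-right, bottom-left.

    Uses image coordinates (x grows right, y grows down):
    the top-left corner has the smallest x + y, the bottom-right the largest,
    the top-right the smallest y - x, and the bottom-left the largest y - x.
    """
    top_left = min(points, key=lambda p: p[0] + p[1])
    bottom_right = max(points, key=lambda p: p[0] + p[1])
    top_right = min(points, key=lambda p: p[1] - p[0])
    bottom_left = max(points, key=lambda p: p[1] - p[0])
    return [top_left, top_right, bottom_right, bottom_left]


# Made-up corner points of a slightly skewed card, in arbitrary order.
corners = [(410, 60), (30, 320), (40, 50), (420, 300)]
ordered = order_corners(corners)
print(ordered)
```

This heuristic assumes the document is not rotated by much more than about 45 degrees; strongly rotated documents need a different ordering rule.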
Are you ready?
Let's start developing the Artificial Intelligence project.
Course Curriculum
Chapter 1: Introduction
Lecture 1: Introduction
Lecture 2: Project Plan
Lecture 3: Project Document
Lecture 4: Download the Resources
Lecture 5: Facing any Issue with the Course? Here is the solution
Chapter 2: Project Setup
Lecture 1: Install Python
Lecture 2: Install Virtual Environment
Lecture 3: Install Packages into Virtual Environment
Lecture 4: Install Tesseract OCR & Pytesseract
Lecture 5: Install spaCy
Lecture 6: Test that the packages are installed
Chapter 3: Data Preparation
Lecture 1: Load Business Card using OpenCV & PIL
Lecture 2: Pytesseract: Extract text from Image
Lecture 3: Pytesseract: Tesseract Error
Lecture 4: Pytesseract: How does Pytesseract work?
Lecture 5: Pytesseract: Image to text to dataframe
Lecture 6: Pytesseract: Clean Text in Dataframe
Lecture 7: Pytesseract: Draw Bounding Box around each word
Lecture 8: Extract Text and Data from all Business Card
Lecture 9: Save data in csv
Lecture 10: Labeling
Chapter 4: Data Preprocessing and Cleaning
Lecture 1: Spacy Training Data Format
Lecture 2: Load Data and convert into Pandas DataFrame
Lecture 3: Updated Code.
Lecture 4: Cleaning Text
Lecture 5: Convert Data into spacy format
Lecture 6: Testing Entities
Lecture 7: Convert data into spacy format for all Business card text
Lecture 8: Splitting Data into Training and Testing Set
Chapter 5: Train Named Entity Recognition (NER) model
Lecture 1: Spacy: Fill the Configuration
Lecture 2: Spacy: Prepare Data
Lecture 3: Spacy: Train NER pipeline model
Lecture 4: Spacy: Save NER Model
Chapter 6: Predictions
Lecture 1: Import Required Libraries
Lecture 2: Clean Text Function
Lecture 3: Load Spacy NER Model
Lecture 4: Extract Text from Image and Convert into Data Frame
Lecture 5: Convert Data Frame into Content
Lecture 6: Get Named Entities from model
Lecture 7: Displacy render
Lecture 8: Tagging Each Word
Lecture 9: Join Label to tokens dataframe
Lecture 10: Join token dataframe with Pytesseract data
Lecture 11: Bounding Box and Tagging Predicted Entities
Lecture 12: Combine the BIO information
Lecture 13: Bounding Box
Lecture 14: Parsing Function
Lecture 15: Testing
Lecture 16: Parse Entities
Lecture 17: Predictions Function
Lecture 18: Final Prediction Pipeline
Chapter 7: Improve Model Performance
Lecture 1: Ideas to Improve model accuracy
Lecture 2: Version-2 model framework: Data Preprocessing
Lecture 3: Train Version 2 model
Lecture 4: Get Predictions from the model
Chapter 8: Document Scanner
Lecture 1: Download the Resources
Lecture 2: What and Why Document Scanner in OpenCV?
Lecture 3: Setup and Read Image
Lecture 4: Resize Image with same aspect ratio
Lecture 5: Edge Detection (Enhance, Blur and Canny) to Document
Lecture 6: Dilate Edges with morphological transform
Lecture 7: Find Four Point Contours (Identify Location of document)
Lecture 8: Apply Warp transform and crop only document
Lecture 9: Document Scanner Function: Putting All together
Lecture 10: Magic Color to Image
Lecture 11: Integrate NER Predictions
Chapter 9: Document Scanner Web App
Lecture 1: What will you Develop?
Lecture 2: Download Web App
Lecture 3: Setting Up Web App Project
Lecture 4: Install VS Code
Lecture 5: Install Flask
Lecture 6: First Flask App
Lecture 7: Run HTML file with Flask server
Lecture 8: Our Web App design steps
Lecture 9: Step-1: Design Page: Create Navigation Bar in HTML
Lecture 10: Step-1: Create About Page
Lecture 11: Step-2: Create HTML form to Upload Image or File in HTML
Lecture 12: Step-3: How to Predict document coordinates with Python in Flask
Lecture 13: Step-2: Upload and save image Backend : create settings.py
Lecture 14: Step-2: Upload and save image Backend: save image from HTML form
Lecture 15: Step-3: Document Scanning
Lecture 16: Adjust coordinates of document using JavaScript
Lecture 17: Warp and Crop the document and save the image
Lecture 18: Get Predictions
Lecture 19: Design Predictions page
Lecture 20: Display results in table
Lecture 21: Final
Chapter 10: Appendix
Lecture 1: Limitations of Pytesseract
Chapter 11: BONUS
Lecture 1: Bonus Lecture: Next Steps
Instructors
- G Sudheer (Instructor)
- datascience Anywhere (Team of Engineers)
- Brightshine Learn (Instructor Team)
Rating Distribution
- 1 star: 2 votes
- 2 stars: 5 votes
- 3 stars: 32 votes
- 4 stars: 115 votes
- 5 stars: 220 votes
Frequently Asked Questions
How long do I have access to the course materials?
You can view and review the lecture materials indefinitely, like an on-demand channel.
Can I take my courses with me wherever I go?
Definitely! If you have an internet connection, courses on Udemy are available on any device at any time. If you don’t have an internet connection, some instructors also let their students download course lectures. That’s up to the instructor though, so make sure you get on their good side!