Remove background of image containing text


I am building a custom OCR pipeline for some documents. After extracting the ROIs, I pass them to Tesseract. To improve accuracy, I want to remove the background of the image. I have observed that for images like these:

[image]

[image]

Tesseract is not able to read anything (because of the lines in the image).

But for images like this one: [image], it gives correct results.

Can anyone suggest how to remove everything from the image except the text?










  • A heuristic approach would be to keep only the color of the text (and possibly a small region around it) and make everything else white, for example. I am not sure it will work, though.

    – Eypros
    Mar 7 at 11:43











  • That might work. Is there a guide on how to do that?

    – 008karan
    Mar 7 at 13:11
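
The color heuristic suggested above could be sketched roughly as follows. This is a minimal NumPy-only illustration, not a tested recipe for these documents: it assumes the text color is known (or sampled by hand) and that the lines have a different color from the text. The function name `keep_text_color` and the `tolerance` value are hypothetical choices for illustration. With OpenCV, `cv2.inRange` builds a similar mask from lower and upper color bounds.

```python
import numpy as np


def keep_text_color(image_rgb, text_color, tolerance=60.0):
    """Whiten every pixel whose color is far from text_color.

    image_rgb:  H x W x 3 uint8 array.
    text_color: (r, g, b) of the ink; assumed known or sampled by hand.
    tolerance:  largest Euclidean RGB distance still treated as text
                (an arbitrary starting value, tune it per document).
    Returns the cleaned image and the boolean text mask.
    """
    pixels = image_rgb.astype(np.float32)
    target = np.asarray(text_color, dtype=np.float32)
    # Per-pixel Euclidean distance to the assumed text color.
    distance = np.linalg.norm(pixels - target, axis=-1)
    text_mask = distance <= tolerance
    # Start from an all-white canvas and copy over only the text pixels.
    cleaned = np.full_like(image_rgb, 255)
    cleaned[text_mask] = image_rgb[text_mask]
    return cleaned, text_mask


if __name__ == "__main__":
    # Synthetic 5x8 ROI: light-blue background, a red line, a dark stroke.
    roi = np.full((5, 8, 3), (200, 220, 255), dtype=np.uint8)
    roi[2, :] = (220, 40, 40)    # colored line across the ROI
    roi[1:4, 3] = (10, 10, 10)   # dark stroke standing in for a glyph
    cleaned, mask = keep_text_color(roi, text_color=(0, 0, 0))
    print(mask.astype(int))      # 1 marks pixels kept as text
```

If the lines are the same color as the text, this mask keeps them too, so the approach only helps when color actually separates the two.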











  • @Eypros Loved the nerd calculator on your profile. You should start with thresholding the image to see if you can get rid of the colors.

    – Rick M.
    Mar 7 at 13:42











  • I tried thresholding, but I get a black-and-white image that still has those lines in it.

    – 008karan
    Mar 7 at 14:04
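
Thresholding alone keeps anything dark, which is why the lines survive it. One possible follow-up, which is not from this thread and is only a sketch, is to clear horizontal runs of dark pixels that are too long to belong to a glyph. The example below uses plain NumPy with a fixed threshold; the function names and the `max_run` parameter are hypothetical. In OpenCV, `cv2.threshold` with the `cv2.THRESH_OTSU` flag picks the threshold automatically, and a morphological opening with a wide, flat kernel is a common way to find such lines.

```python
import numpy as np


def binarize(gray, threshold=128):
    """Global threshold: True where a pixel is dark enough to count as ink."""
    return gray < threshold


def remove_long_horizontal_runs(ink, max_run):
    """Clear horizontal runs of ink pixels longer than max_run.

    ink:     H x W boolean array, True = dark pixel.
    max_run: longest run (in pixels) still treated as part of a glyph;
             a hypothetical parameter, chosen from the expected text size.
    """
    cleaned = ink.copy()
    height, width = ink.shape
    for y in range(height):
        x = 0
        while x < width:
            if not ink[y, x]:
                x += 1
                continue
            start = x
            # Walk to the end of the current run of ink pixels.
            while x < width and ink[y, x]:
                x += 1
            if x - start > max_run:
                cleaned[y, start:x] = False
    return cleaned


if __name__ == "__main__":
    gray = np.full((5, 10), 230, dtype=np.uint8)
    gray[2, :] = 40      # dark horizontal line
    gray[0:5, 4] = 20    # vertical stroke standing in for a glyph
    ink = binarize(gray)
    print(remove_long_horizontal_runs(ink, max_run=3).astype(int))
```

Note the trade-off: where a line crosses a character, the crossing pixels are removed together with the line, so glyphs can end up with small gaps. Slanted or curved lines are not caught by a purely horizontal scan.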











  • With the free OCR API I get "JTFF" for the image out of the box. Using an Asian language such as Korean often works better for tricky English(!) letters, so you could try that.

    – Jim Grigoryan
    Mar 7 at 19:23
















python opencv image-processing ocr tesseract






edited Mar 11 at 6:50
008karan

asked Mar 7 at 11:35
008karan