[Suggestion] Emoji recognition #27

kschelonka · 2019-07-03T16:07:36Z

Having to look up and copy-paste emoji slows down transcription. It might be useful for the OCR bot to output the emoji used in the text at the end of its comment. That way the user could copy-paste from the comment, rather than having to look up all the individual emoji.

A model could be trained to detect emoji separately from the OCR service, and invoked afterwards.

codingJWilliams · 2019-07-03T16:57:41Z

Sounds useful, but does ocr.space support that? If not, what other services are available to do something like this?

itsthejoker · 2019-07-03T17:47:53Z

Either my Google skills are failing me horribly or this hasn't been done before. I think this is a fascinating idea, but I'm struggling to come up with ways to make it technologically feasible.

Emojipedia offers an API, but only for the raw data (i.e., you have to already know which emoji you're looking at)... the biggest problem is keeping track of all the different vendor interpretations of various emoji. What I suppose might work is training a standard OCR model on only emoji, using all the vendor implementations of every emoji as the training data. It would be... very difficult... to insert them into the appropriate places, but we might be able to make a list at the bottom that just says "these are all the emoji in the above transcription: ❤️ 🎉 🍰 😍" or something along those lines.
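For illustration only, here is a rough sketch of that "list at the bottom" step. It assumes some separate detector has already produced a list of emoji strings for the image, and that detector is the hard part and is not shown. The function names and the footer wording are made up for this sketch and are not anything the bot currently has.

```python
from typing import Iterable

# Hypothetical footer wording, borrowed from the suggestion above.
FOOTER_PREFIX = "These are all the emoji in the above transcription: "


def emoji_footer(detected: Iterable[str]) -> str:
    """Build the footer line from detected emoji, keeping first-seen order."""
    # dict.fromkeys drops repeats while preserving insertion order.
    unique = list(dict.fromkeys(detected))
    if not unique:
        return ""
    return FOOTER_PREFIX + " ".join(unique)


def append_emoji_footer(ocr_text: str, detected: Iterable[str]) -> str:
    """Append the footer to the OCR text, or return the text unchanged
    when no emoji were detected."""
    footer = emoji_footer(detected)
    if not footer:
        return ocr_text
    return ocr_text.rstrip() + "\n\n" + footer
```

Each emoji is treated as an opaque string here, so multi-code-point emoji such as ❤️ pass through untouched. Placing them inline in the text would need position information that this sketch does not attempt to handle.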

Something else worth keeping in mind is that OCR is very expensive in terms of computation time. When we originally started, we ran our own instance of Tesseract for a few months and suffered a 17-minute queue time. Not something I'd particularly like to do again.

itsthejoker · 2020-09-12T05:10:36Z

This might be possible with OpenCV -- I'm not sure that it's worth the effort as it looks like it will require effectively rolling our own OCR solution again, but I'm putting this in here so I don't lose it. https://stackoverflow.com/questions/35486522/ocr-an-ios-android-messaging-app-screenshot#comment74110958_35486522
