Have you ever wanted to copy-paste the text in an image file? by Allan K. '17
Well, now you can, thanks to a HackMIT project.
If you’re a regular on /r/technology, you might have seen Project Naptha trending on the front page.
I’ve blogged a couple of times already about hackathons at MIT and how much experience, learning, and motivation they can pack into ten or twenty sleepless hours. Project Naptha, created by my classmate Kevin K. ’17, is a Chrome extension whose birth I witnessed (and blogged about) at HackMIT last October. Then called “ImagesAsText,” it won Kevin second prize (and $2000).
What, exactly, does Project Naptha do?
From the website:
- Project Naptha automatically applies state-of-the-art computer vision algorithms on every image you see while browsing the web. The result is a seamless and intuitive experience, where you can highlight as well as copy and paste and even edit and translate the text formerly trapped within an image.
You can see it in action on the Project Naptha website. It’s pretty amazing. Essentially, it detects text in memes, image macros, screenshots, graphs, scanned images, and even certain photographs (yes, that includes rotated text) and treats it as highlightable text. Then, using some nifty image processing, it allows you to remove that text from the image entirely, or replace it with its translation in a different language, or replace it with whatever text you desire. There’s even an Easter egg that gives the selected text an animation. It does all of this image processing in real time, with no discernible lag or delay. It can do this for pretty much every language on the Internet, including languages written in non-Latin scripts, like Chinese and Russian.
Kevin is kind of a hacker extraordinaire.
Project Naptha has since been picked up by PCWorld, the Verge, Engadget, and CNET. CNET calls it “nothing short of JavaScript black magic.” Since its release yesterday, it has had upward of 52,000 downloads (as of 5:09 PM EDT on April 25).
It’s not perfect, but it comes very, very close. It’s really limited only by the current state of optical character recognition, translation quality, and the image processing that fills in the image where text is removed (think Photoshop’s Content Aware Fill).
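For a rough feel of what "filling in the image where text is removed" means, here is a toy sketch in Python. To be clear, this is not Project Naptha's actual method, and it is far cruder than Content Aware Fill. The function name `fill_masked`, the grayscale list-of-lists image, and the boolean mask marking text pixels are all assumptions made up for illustration. The idea is simply to paint over the masked pixels from the outside in, using the average of whatever neighboring pixels are already known.

```python
def fill_masked(image, mask):
    """Toy inpainting: fill masked pixels by averaging known 4-neighbours.

    image: 2D list of grayscale values (hypothetical input format).
    mask:  2D list of booleans, True where text was detected and removed.
    Returns a new 2D list; the input image is left untouched.
    """
    height, width = len(image), len(image[0])
    result = [row[:] for row in image]
    # Pixels we still have to invent a value for.
    unknown = {(y, x) for y in range(height) for x in range(width) if mask[y][x]}

    while unknown:
        updates = {}
        for (y, x) in unknown:
            # Collect in-bounds neighbours whose values are already known.
            neighbours = [
                result[ny][nx]
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1))
                if 0 <= ny < height and 0 <= nx < width and (ny, nx) not in unknown
            ]
            if neighbours:
                updates[(y, x)] = sum(neighbours) / len(neighbours)
        if not updates:
            break  # everything is masked, so there is nothing to borrow from
        # Apply this ring of fills, then move one step further inward.
        for (y, x), value in updates.items():
            result[y][x] = value
        unknown.difference_update(updates)
    return result


# A plain background of 10s with one bright "text" pixel in the middle.
picture = [[10, 10, 10], [10, 255, 10], [10, 10, 10]]
text_mask = [[False, False, False], [False, True, False], [False, False, False]]
cleaned = fill_masked(picture, text_mask)
```

On a flat background like this one, the bright pixel simply takes on the value of its surroundings. On a real photo, a neighbor average like this would leave a visible blur, which is a big part of why doing this well is hard.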
The only question really left is: can it do Captchas?
Signing off,
Allan
Add Project Naptha to Chrome here. For demos and backstory, check out the website.