Using Waifu2x to Upscale Japanese Prints

In my spare time I’ve been working on a database of Japanese prints for a little over 3.5 years now. I’m fully aware that I’ve never actually written about this, personally very important, project on my blog — until now. Unfortunately this isn’t a post explaining that project. I do still hope to write more about the intricacies of it, some day, but until that time you can watch this talk I gave at OpenVisConf 2014 about the site and the tech behind it.

I’ve been doing a lot of work exploring different computer vision, and machine learning, algorithms to see how they might apply to the world of art history study. I’m especially interested in finding novel uses of technology that could greatly benefit art historians in their work and also help individuals better appreciate art.

One tool that I came across yesterday is called Waifu2x. It’s a convolutional neural network (CNN) that is designed to optimally “upscale” images (taking small images and generating a larger image). The creator of this tool built it to better upscale poorly-sized Anime images and video. This is an effort that I can massively cheer on – while I’m not a purveyor of Anime, I love applying algorithmic overkill to non-tech hobbies.

Waifu2x also provides a live demo site that you can use to test it on other images. When I saw this I became immediately intrigued. Anime has direct stylistic influences drawn from the “old” world of Japanese woodblock printing, which was popular from the late 1600s to the late 1800s. Maybe the pre-trained upscaler could also work well for photos of Japanese prints? (Naturally I could train a new CNN to do this, but it may not even be necessary!)

Now the first questions that should come up, before even attempting upscale Japanese prints, are simply: Are there enough tiny images of prints that need to be made bigger? Who will benefit from this?

To answer those questions: Unfortunately there are tons of tiny pictures of Japanese prints in the world. To provide one very real example: The Tokyo National Museum has one of the greatest collections of Japanese prints in the world… none of which are (publicly) digitized. If a researcher wants to see if a the TNM has a particular copy of a print they’ll need to use the following 3 volume set of books (which I own):

TNM Books

Inside the book is 3,926 small, black-and-white, scans of every print in their collection:

Inside of TNM Book

I plan on digitizing these books and bringing these (not-ideal) prints online as they will be of the utmost use to scholars. However given their tiny size it will be very hard for most researchers to be able to make out what exactly the print is depicting. Thus any technology that is able to upscale the images to make them a bit easier to view would be greatly appreciated.

I began by experimenting with a few different, existing, print images and was really intrigued by the results.

I started with a primitive Ukiyo-e print by Hishikawa Moronobu:

(Source Image)

And here is the image scaled 2x (OSX Preview) and then using Waifu2x using high noise reduction (be sure to click the images to see them full size):

And here is another early actor print by Utagawa Kunisada:

(Source Image)

And here is the image scaled 2x (OSX Preview) and then using Waifu2x with low noise reduction and high noise reduction (be sure to click the images to see them full size):

I also have a few blown-up details comparing the results between these three (OSX Preview 2x, Waifu2x with low noise reduction, Waifu2x with high noise reduction):

Head Detail 2x Head Detail Waifu2x Low Head Detail Waifu2x

Arm Detail 2x Arm Detail Waifu2x Low Arm Detail Waifu2x

Cartouche Detail 2x Cartouche Detail Waifu2x Low Cartouche Detail Waifu2x

It’s immediately apparent that the lines are still quite “crisp” in both of the Waifu2x versions. This is extremely compelling as being able to see those details can be quite important. In fact seeing these upscaled images like this is seriously quite impressive. It definitely reinforces the fact that the algorithms Anime training data has suited this subject matter well!

To my eye it almost looks like they entire image has also become more “smooth”, it seems much more mottled, almost as if someone spilled water on it (Japanese prints are printed with watercolors and thus are quite susceptible to water damage that actually creates similar results as to what’s seen here). This is especially true when using Waifu2x’s high noise reduction. It’s not clear that this would become any better with better training data as the original source image really only has so much data to begin with.

Waifu2x’s high noise reduction also causes the background to become much more tumultuous, as if it was crumpled wrapping paper. And the details in the signature, that make it readable, tend to be lost with that much processing. I suspect that using just the low noise reduction may be a better sweet spot.

Helping researchers be able to better see the lines of a print from a tiny image can be a double-edged sword. They will certainly appreciate being able to have a larger and semi-crisp image. However the lack of precision is a massive problem (not that the original image at 2x would’ve been any better). Researchers rely upon being able to spot minute differences between print impressions in order to be able to understand when a print could’ve been created. I suspect that using this technique will need to be reserved to only extremely-small images and come with a massive caveat warning the viewer as to the nature of the image and its appearance.

Regardless, this technology is quite exciting and it’s extremely serendipitous that the subject matter that was used just so happens to correlate nicely with one of my areas of study. I’ll be very curious to see where else people have success with this particular utility and in what context.

Source: John Resig

Leave a Reply

Your email address will not be published.