I know, nobody wants to hear this, but…
Just to assure you that I am no newbie at this: I have been doing film restoration (editing, color correction, encoding, and so on) professionally for almost 20 years.
As all of you know, there is no real way to upscale anything (except vector graphics…), because you cannot generate detail that isn't there out of nothing. In many cases there ARE more details that are just not obviously visible, and with the right sharpening algorithms you can do a nice job on those, but that is still no "real" upscaling. Clear.
This was the situation, until somebody had the idea to use AI…
Thanks to rapid inventions and later "central" implementations, a lot of developers released a lot of software that, at the end of the day, mostly used the same algorithms, and in the beginning this looked very, very interesting. There were and are samples upon samples that look unbelievably good, quite magical.
So I tested everything I could get my hands on, whenever I found the time for it. My approach: first see, then think. If it worked really well, the algorithms were not worth thinking about, at least for me, since I did not intend to take part by coding, just as a user.
The first upscale I did (after a lot of testing I decided to use Gigapixel for it) was a reproduction of an old painting, to get a more detailed printing master. This worked incredibly well! In THIS case…
It worked because the brushstrokes form a kind of pattern that made it easy for Gigapixel to find SIMILAR PATTERNS in its database. And that is what it is all about in the end, with all machine learning:
finding a similar pattern that you can use for the details! But…
This search for similarities is blind! Only similarity counts. That means that if the structure of a distant grassy valley for some reason looks like nearby green-colored hair, the machine-learning algorithm MAY decide to use that structure and add details to the grass taken from somebody's green hair.
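A minimal toy sketch of what "only similarity counts" means (all values, labels, and function names here are hypothetical, just to illustrate the principle, not any real upscaler's code):

```python
# Toy sketch of similarity-only patch matching. A "patch" is just a
# flat tuple of pixel values; the "database" stores example patches
# together with a label saying where they came from.

def distance(a, b):
    """Sum of squared differences between two equally sized patches."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def best_match(query, database):
    """Return the (label, patch) pair whose patch is most similar to
    the query. Only the pixel values count; the label is ignored."""
    return min(database, key=lambda item: distance(query, item[1]))

# Hypothetical database: a distant-grass texture and a green-hair texture.
database = [
    ("grass", (80, 90, 85, 88)),
    ("green hair", (60, 62, 61, 63)),
]

# A blurry grass patch from the input picture…
query = (58, 63, 60, 62)

label, patch = best_match(query, database)
# The hair patch happens to be numerically closer, so its details win,
# even though semantically the query is grass.
```

The matcher has no idea what the query shows; whatever patch is numerically closest wins, which is exactly how hair detail can end up pasted into grass.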
You MAY not notice this in a still image: sometimes you don't, sometimes you do, by chance.
But in a video you always notice it, because the patterns used CHANGE all the time. In one frame the AI uses different patterns than in the next ones. This generates a lot of noise, unsteadiness, flicker, and so on.
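The same toy idea shows why video flickers where a still image would not (again, all numbers are made up for illustration): a tiny bit of noise between two consecutive frames is enough to flip the nearest match, so a different texture gets pasted into the same spot.

```python
# Toy sketch (hypothetical values): the same image region in two
# consecutive frames differs only by a little noise, yet the
# nearest-neighbour choice flips, so the pasted "detail" flickers.

def distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def best_label(query, database):
    return min(database, key=lambda item: distance(query, item[1]))[0]

database = [
    ("pattern A", (50, 50, 50, 50)),
    ("pattern B", (54, 54, 54, 54)),
]

frame1 = (51, 51, 51, 51)   # region in frame 1
frame2 = (53, 53, 53, 53)   # same region one frame later, plus noise

# frame1 matches "pattern A", frame2 matches "pattern B":
# two different textures pasted into the same spot, visible as flicker.
```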
I ran a lot of tests with it and wrote some not-too-bad AviSynth scripts to compensate for these effects. Not useless, but not convincing in the end either. No real upscaling possible, least of all for old, low-quality material, which would have been a great goal!
The newer idea behind real video AI upscaling was to
-use neighboring frames of the same scene as part of the database, a wise decision for finding really similar structures, but at the cost of not gaining much sharpness in most cases…
-enforce more steadiness in the selected patterns
So I tested, for example, the video tool from Gigapixel, which I mentioned above.
As expected, the gain in detail was MUCH less than with image upscaling. Comparable to a rather average conventional "upscale".
So I began to think about the situation. My opinion: the problem with all of this is that
added details are still selected by simple SIMILARITY, regardless of their origin!
So this is still no real AI that works like a human restorer. When a restorer restores a lost detail in, for example, a painting, he knows WHAT he is restoring! For example, some grass in the distance…
So what is needed in the future is to build really HUGE nested databases of all the things you can see out there,
with all the nesting, linking, rating, comparing, and so on that a human brain does, so that in the end an algorithm will KNOW that it has to take the RIGHT pattern, and not just something that looks similar.
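Reusing the toy grass-versus-hair example from above (all values hypothetical), the difference between today's approach and such a semantic database boils down to one extra filtering step before the similarity search:

```python
# Sketch: similarity-only matching vs. matching restricted to the
# RIGHT semantic class. Labels and pixel values are made up.

def distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

database = [
    ("grass", (80, 90, 85, 88)),
    ("green hair", (60, 62, 61, 63)),
]

query = (58, 63, 60, 62)          # blurry distant grass

# Today: pure similarity, label ignored. The hair texture may win.
naive = min(database, key=lambda it: distance(query, it[1]))

# With a semantic database: only patterns KNOWN to be grass compete,
# so the numerically closer hair texture is never even considered.
semantic = min(
    (it for it in database if it[0] == "grass"),
    key=lambda it: distance(query, it[1]),
)
```

Of course, the hard part is not this filter; it is knowing reliably that the query IS grass in the first place, which is exactly the kind of knowledge a human restorer brings.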
Until then we will surely experience some improvements, but if you really look at close range, it will always have something of Frankenstein about it.