FrankB

User Group: Members
Join date: 12-Jan-2017
Last activity: 24-Feb-2020
Posts: 10

Post History

Post #1325375
Topic: Gigapixel AI vs. infognition Super Resolution or What to use to upscale SD to HD or 4K

phoenixobia said:
Interesting. Makes sense. I just don’t know scripting, or how to learn it. But I know it’s great.

- Install AviSynth (for better compatibility, start with the 32-bit version).
- Write a script (you can do it with Notepad) containing an input source and a return, e.g.:

# Open the source file (the path is just an example).
v = AviSource("D:\Videos\1.avi")
return v

Save it with the extension .avs.
Every video application that opens AVI files through the installed handler will now open this .avs as a video; VirtualDub2 is recommended. That’s all. It gets interesting when you write something between those two lines. There are also many more “source” filters that can open nearly any format. I would recommend the LSMASHSource package, which provides LWLibavVideoSource. Opening a file with it can take a while, because it writes an index first, but afterwards it is fast, and it handles sources very accurately without changing anything. There are others too, also based on ffmpeg. Between those two lines you can do almost anything with the video - more than you ever dreamt of…
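As a minimal sketch, opening a file that way could look like this (the plugin path and file name are placeholders; adjust them to your system):

# Load the LSMASHSource plugin; the path below is only an example.
LoadPlugin("C:\AviSynth\plugins\LSMASHSource.dll")
# The first open writes an index file next to the source, so it can take
# a while; every later open is fast.
v = LWLibavVideoSource("D:\Videos\1.mkv")
return v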

But the big downside is the frame rate when it comes to 24 fps film: the shift in speed and the change of pitch in the audio after conversion to 25 fps is very noticeable and annoying to me. I know how to convert a PAL source back to progressive 23.976 fps and change the audio speed to match the new frame rate, but when the audio becomes slower, its quality suffers to some extent, because the pitch drops.

Yes, you are right. It is especially a problem when you release something on Blu-ray at the correct 23.976 or 24 fps but the dubbing had been made for PAL TV, so you have to slow the sound down. The quarter-tone-lower pitch, in my opinion, is not as audible or annoying as the slower speed. One can correct the pitch, but that does not solve the problem. The alternative is to release the Blu-ray at 25 fps, but this causes other problems and the picture remains too fast…
In spite of these problems, I much prefer PAL when I look at pulldown and IVTC issues.
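A hedged sketch of that reversal in AviSynth, assuming a progressive 25 fps PAL source (the file name is a placeholder); AssumeFPS with sync_audio=true slows picture and sound together, which also lowers the pitch:

v = AviSource("D:\Videos\pal_progressive.avi")
# Re-time the 25 fps transfer back to film speed (23.976 fps);
# sync_audio=true stretches the audio to stay in sync, lowering its pitch.
v = AssumeFPS(v, 24000, 1001, sync_audio=true)
return v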

film --> scanned with pulldown (telecined) --> stored really uncompressed or losslessly compressed --> IVTCed
would give you 100% progressive frames back, that’s right.

Yes, Laserdisc is uncompressed because it’s analog, correct?

Yes, but the problem I mentioned was not the Laserdisc or your transfer to hard disk, but the possible steps the material may have gone through earlier, from the scanning of the progressive celluloid film until it was copied to the Laserdisc. But I fear we have drifted somewhat off-topic by now…

Post #1325232
Topic: Gigapixel AI vs. infognition Super Resolution or What to use to upscale SD to HD or 4K

phoenixobia said:

As for IVTC, I haven’t tried AviSynth yet, but I’ve read about it and would like to try it. I am aware of the pattern changes that can happen, but I think that could be the case with any algorithm. As you mentioned, doing it by hand to produce progressive results is the best way to get perfect results, but I don’t see how it can be fast. Shouldn’t that take forever? Please elaborate on that.

A hand-made IVTC in AviSynth (using SelectEvery) runs lightning fast. But I think you meant the time it takes to produce the script. A script that handles the different patterns by hand can in most cases (>99%) be written in about 15-30 minutes. There are only three patterns in 99.5% of all cases; the only exceptions I have met so far were Japanese NTSC sources, which sometimes use a different kind of pulldown, but even that is rare.
So you just have to identify those three patterns, handle them with variables, and put everything together with Trims, as in the sketch below.
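A hedged sketch of what such a hand-made IVTC could look like (the file name, the cut point and the SelectEvery indices are placeholders; the real indices have to be read off your source’s field pattern by eye):

v = AviSource("D:\Videos\telecined.avi").AssumeTFF().SeparateFields()
# Each SelectEvery keeps 8 of every 10 fields, pairing each film frame's
# top and bottom field so that Weave() can rebuild whole progressive frames.
# Pattern 1 up to the (made-up) pattern change, pattern 2 afterwards:
a = v.Trim(0, 9999).SelectEvery(10, 0,1, 2,3, 6,5, 8,7)
b = v.Trim(10000, 0).SelectEvery(10, 0,1, 4,3, 6,5, 8,9)
return (a ++ b).Weave()

The ++ splices the corrected segments back together; with more pattern changes you simply add more Trim/SelectEvery pairs.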
It is a question of how exact you want your result to be. In a professional setting, where you have to IVTC a series with >100 episodes, you certainly wouldn’t do this by hand; but if you do something with love - as everybody here does, I think - and are willing to spend some extra time to achieve the very best result, it’s no question, is it?
I can only encourage you and everyone else to try AviSynth. It’s the most flexible tool for handling video out there, and you have FULL control over what’s happening.

Second, I’m not sure I understand the part where you talk about telecining while scanning.
If the original footage was 24 fps film/animation and was telecined (3:2 pulldown) to 29.97 fps, which is the case with Laserdiscs, IVTC is the process that removes those added frames and brings it back to 23.976 fps. The result should have no jagged edges (or, as you say, staircase artefacts) and no halved resolution. My IVTC’d video has no rough edges, if that’s what you mean, so please explain this.

This ugly pulldown business that you have with NTSC (I am lucky to live in a PAL region) is produced in different ways: sometimes during scanning as a single process (older sources), sometimes later. But that is not the point I was making.
The point is compression while interlaced. Maybe this Laserdisc source comes from an older scan. At a very early stage it was copied to - let’s say - a DigiBeta cassette that was then archived. The DigiBeta format is quite good, but it is compressed: not much, but lossily. If the source copied to DigiBeta is progressive, you won’t notice the compression with your eyes, no chance. But a pulled-down source is combed… The ugliest thing about pulldown is that in almost every case you do not simply add frames by doubling every fourth one, no - it’s fields that are added, as you of course know, and that results in combing. Lossy compression - regardless of how good it is - does harm in these cases. Let me be specific:

film --> scanned with pulldown (telecined) --> stored really uncompressed or losslessly compressed --> IVTCed
would give you 100% progressive frames back, that’s right.

film --> scanned with pulldown --> maybe stored uncompressed once --> copied to and archived as DigiBeta --> perhaps even copied to another medium via SDI or similar interfaces --> IVTCed
will result in small jagged edges (thanks for the term) that increase with sharpening, even with AI. There is no lossy compression algorithm that produces NO such edges at all when handling combed material. You won’t notice them in most cases, but you will see them if you sharpen, and that is exactly what is done here.
The later the pulldown happens in the chain, the better the chances of getting the original progressive frames back more or less artefact-free.

Below are the images. It’s best to download them and view them at 100%, but you can still see the difference here.

These look damned good! One could criticize many things, but to my eye they really look good. I only doubt that this could not also have been achieved with more conventional means than AI - and I suspect that GP does take advantage of those as well. 😉

Post #1325185
Topic: Gigapixel AI vs. infognition Super Resolution or What to use to upscale SD to HD or 4K

It sounds as if you tried hard not to lose anything around the upscaling. What is really good is that you exported from GPVE to lossless images, to collect them together afterwards. How well does the heart of this, GP Video Enhance, actually work for you? I found it rather weak. Can you post some screenshots?
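Incidentally, collecting such an exported image sequence back into a clip is easy in AviSynth; a minimal sketch (the file pattern, frame range and frame rate are placeholders):

# Rebuild a clip from a numbered PNG sequence exported by the upscaler.
ImageSource("D:\Upscaled\frame%06d.png", start=0, end=9999, fps=23.976)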

May I point out a few things? I’ll take the liberty; maybe I can help:

For IVTC, one of the key points in your workflow, there are much better ways. Most AviSynth experts use TIVTC, which produces excellent results without any loss in quality if you use AviSynth correctly. Even better is to IVTC by hand with AviSynth (even if the Doom9 cracks don’t like this… 😉 ). Very often you have only one, or fewer than 4 or 5, pattern changes to handle, which is done quite quickly by hand, and you get 100% jitter-free results, which NO automatic algorithm can achieve.
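A minimal TIVTC sketch (the plugin path and file name are placeholders): TFM() does the field matching, TDecimate() then drops the duplicate frames.

LoadPlugin("C:\AviSynth\plugins\TIVTC.dll")
AviSource("D:\Videos\telecined.avi")
TFM()        # field matching: reconstructs progressive frames from fields
TDecimate()  # removes the duplicated frames, 29.97 -> 23.976 fps
return last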
Also, unfortunately, if telecining (pulldown) was done directly during scanning and different lossy compressions were applied later, “staircase” artefacts (I don’t know the right term in English) often remain after IVTC. They are hard to correct without losing resolution, but if you plan to upscale/sharpen afterwards, it is often better to remove them at the cost of a little resolution, which you “get back” (not really) with your upscale. Otherwise those staircases become more and more visible. If you have a source with absolutely NO such artefacts after IVTC, you are lucky.

At the end I would rather export to some lossless codec; then it is still possible to improve or edit things later, and so on. But maybe you simply didn’t mention that.

Post #1325101
Topic: Gigapixel AI vs. infognition Super Resolution or What to use to upscale SD to HD or 4K

I know nobody wants to hear this, but…
Just to assure you that I am no newbie in all this: I have been doing film restoration (editing, color correction, encoding and so on) professionally for almost 20 years.
As all of you know, there is no real way to truly upscale anything (except vector graphics…), because you cannot generate detail that isn’t there out of nothing. There are many cases where more detail IS there, just not obviously visible, and where you can do a nice job with the right sharpening algorithms, but that is still no “real” upscaling. Clear.

This was the situation until somebody had the idea to use AI…
Thanks to rapid inventions and later “central” implementations, many developers presented a lot of software that, at the end of the day, mostly used the same algorithms, and in the beginning this looked very, very interesting. There were, and are, samples upon samples that look unbelievably good, quite magical.
So I tested everything I could get hold of, whenever I found the time for it. I thought: first see, then think. If it worked really well, the algorithms would not be worth thinking about, at least for me, since I did not intend to take part by coding - just as a user.

The first upscale I did (after a lot of testing I decided to use Gigapixel for it) was for a reproduction of an old painting, to get a more detailed printing master. This worked incredibly well! In THIS case…
It worked because the brushstrokes are a kind of pattern that made it easy for Gigapixel to find SIMILAR PATTERNS in its database. And that is what it is all about in the end, with all machine learning:

To find a similar pattern that you can use for details! But…

This matching of similarities is essentially arbitrary! Only similarity counts! That means that if the structure of a distant grassy valley for some reason looks like nearby green-coloured hair, the machine learning algorithm MAY decide to use that structure and add detail to the grass from somebody’s green hair.
You MAY not notice this in a still picture - sometimes you don’t, sometimes you do, by chance.
But in a video you always notice it, because the patterns used CHANGE all the time. In one frame the AI uses different patterns than in the next ones, and this generates a lot of noise, unsteadiness, flicker and so on.
I ran a lot of tests with it and wrote some not-too-bad AviSynth scripts to compensate for these effects - not useless, but in the end not convincing either. No real upscaling was possible, least of all of old, lower-quality material, which would have been a great goal!

The newer idea behind real video AI upscaling was then to
- use neighbouring frames from the same scene as part of the database - a wise decision for finding really similar structures, but at the cost of gaining less sharpness in most cases…
- implement more steadiness in the selected patterns.
So I tested, for example, Gigapixel’s video tool that someone mentioned above.
As expected, the gain in detail was MUCH smaller than with image upscaling - comparable to rather average conventional “upscale” quality.

So I began to think about the situation. In my opinion, the problem with all of this is:
Details are still added by simple SIMILARITY, regardless of their origin!
So none of this is real AI yet, in the sense of working like a human restorer. When a restorer adds a lost detail while restoring a painting, for example, he knows WHAT he is restoring - for example, some grass in the distance…
So what is needed in the future is to build truly HUGE nested databases of all the things you can see out there, with all the nesting, linking, rating and comparing that a human brain does, so that an algorithm will in the end KNOW that it has to take the RIGHT pattern, and not merely something that looks similar.
Until then we will surely see some improvements - but if you look really closely, the result will always have something of Frankenstein about it.