Yeah, that's very slow. It takes about 10 min to build a model for a 1080p test and reference frame on my 2.6 GHz laptop, which has 6 GB of memory.
That 16 mm frame is a real challenge. The colors are very different, and it's pretty noisy. Perhaps using a temporal denoiser will reduce the noise artifacts. Also, colors can probably be matched more closely if the 16 mm frame and the 35 mm frame are cropped in the same way.
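
To illustrate the cropping point, here is a rough sketch in Python with NumPy. It is not the model discussed above, just a simple per-channel gain/offset match (mean and standard deviation transfer) that I'm swapping in for illustration. The function names `center_crop`, `fit_gain_offset` and `apply_gain_offset` are made up for this example, and it assumes both frames are already aligned H x W x C arrays. The idea is that the color statistics are measured only on the picture area both frames share, and the correction is then applied to the full test frame.

```python
import numpy as np


def center_crop(frame, height, width):
    """Return the centered height x width region of an H x W x C frame."""
    h, w = frame.shape[:2]
    top = (h - height) // 2
    left = (w - width) // 2
    return frame[top:top + height, left:left + width]


def fit_gain_offset(test_crop, ref_crop):
    """Per-channel gain and offset that map the test statistics onto the reference.

    Both crops are assumed to show the same picture area.
    """
    t = test_crop.astype(np.float64)
    r = ref_crop.astype(np.float64)
    t_mean, t_std = t.mean(axis=(0, 1)), t.std(axis=(0, 1))
    r_mean, r_std = r.mean(axis=(0, 1)), r.std(axis=(0, 1))
    gain = r_std / np.maximum(t_std, 1e-8)  # guard against flat channels
    offset = r_mean - gain * t_mean
    return gain, offset


def apply_gain_offset(frame, gain, offset):
    """Apply the per-channel correction to a full frame (returns float64)."""
    return frame.astype(np.float64) * gain + offset
```

Usage would be something like `gain, offset = fit_gain_offset(center_crop(test, h, w), center_crop(ref, h, w))` followed by `apply_gain_offset(test, gain, offset)`. If the two crops covered different picture content, the means and standard deviations would be measured on different things and the match would be skewed, which is why cropping both frames the same way should help.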