A moving object, let's say Ben's face, is slightly closer to the camera in one frame than in the next. Your algorithm pulls detail from the previous frame for use in the current one, but that detail comes from a slightly different depth. You get a picture with both superimposed on each other. Yes, it has more micro-detail, and in certain applications that matters (like if you want the footage to appear alongside native HD material), but it comes at the cost of picture depth.
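A minimal sketch of that failure mode in NumPy (the frame data, blur size, and function name here are purely illustrative, not the actual algorithm): grafting the high-frequency detail of the previous frame onto the current one, with no motion compensation, leaves edges at their *old* positions superimposed on the new frame.

```python
import numpy as np

def add_prev_frame_detail(curr, prev, blur_size=3):
    """Naively graft high-frequency detail from the previous frame
    onto the current one (no motion compensation)."""
    def box_blur(img, k):
        # simple same-size box filter via edge padding
        pad = k // 2
        padded = np.pad(img, pad, mode='edge')
        out = np.zeros(img.shape, dtype=float)
        for dy in range(k):
            for dx in range(k):
                out += padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
        return out / (k * k)

    detail = prev - box_blur(prev, blur_size)  # high-pass of previous frame
    return curr + detail  # prev's edges land at their old positions

# A bright edge at column 10 in the previous frame has moved to column 12:
prev = np.zeros((1, 24)); prev[0, 10] = 1.0
curr = np.zeros((1, 24)); curr[0, 12] = 1.0
merged = add_prev_frame_detail(curr, prev)
# merged carries structure at both column 10 (ghost from prev) and
# column 12 (the real edge) -- the "superimposed" look described above
```

Real implementations motion-compensate before borrowing detail, but any residual misalignment (including depth change, which pure 2D motion vectors can't fully model) still leaks in as this kind of doubled structure.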
I'm not saying it's a bad method, I'm just saying it comes at a cost.