Soften the focus a little as you move deeper into the BG on that footage, and I think you might improve it.
I'd disagree - wouldn't less depth of field make it look more like a model? I can't find a link, but there's a guy who photographs cities with a tilt-shift lens that dramatically reduces the depth of field, and if you didn't know better, you'd swear the cities were fantastically detailed but very small miniatures.
ETA: a bit like this:
