The way it works is you have to speak specific sentences, and quite a few of them, to even get a close approximation. It’s designed to prevent prerecorded audio from being used (and abused).
Ok, I’ll edit Matt Lanter audio together to make a perfect recreation