For this to work, the person has to speak highly specific phrases to generate a voice profile. That requirement is there to prevent exactly this kind of thing: making a profile of someone else.
Here’s a different app which can clone a voice with just 5 seconds of audio: