I'm no expert, and surely someone must have thought of this already, but what about combining to mono, then using a low-pass and a high-pass filter to remove anything outside the range of the voice?
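For what it's worth, here is a rough sketch of what I mean, in plain Python. The function names, the simple one-pole filters, and the 300 to 3400 Hz cutoffs (roughly the telephone voice band) are all my own assumptions, not anything taken from a real tool. It only shows the idea and is not meant as a finished solution.

```python
import math

def to_mono(left, right):
    """Average the two channels sample by sample."""
    return [(l + r) / 2.0 for l, r in zip(left, right)]

def high_pass(samples, cutoff_hz, sample_rate):
    """One-pole RC high-pass: attenuates content below cutoff_hz."""
    dt = 1.0 / sample_rate
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    alpha = rc / (rc + dt)
    out = []
    prev_x = 0.0
    prev_y = 0.0
    for x in samples:
        prev_y = alpha * (prev_y + x - prev_x)
        prev_x = x
        out.append(prev_y)
    return out

def low_pass(samples, cutoff_hz, sample_rate):
    """One-pole RC low-pass: attenuates content above cutoff_hz."""
    dt = 1.0 / sample_rate
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    alpha = dt / (rc + dt)
    out = []
    prev_y = 0.0
    for x in samples:
        prev_y = prev_y + alpha * (x - prev_y)
        out.append(prev_y)
    return out

def voice_band(left, right, sample_rate, low_hz=300.0, high_hz=3400.0):
    """Mix to mono, then keep roughly the low_hz..high_hz band.

    The default cutoffs are an assumed voice range; adjust by ear.
    """
    mono = to_mono(left, right)
    return low_pass(high_pass(mono, low_hz, sample_rate), high_hz, sample_rate)
```

You would call it as `voice_band(left, right, 44100)` with two equal-length lists of float samples. One-pole filters have a very gentle slope, so anything just outside the band is only turned down, not removed; a steeper filter would cut harder, but it still can't touch music that sits inside the voice range.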
I don't claim to be an expert either, but I've tried that as well. Look at most music on an equalizer or spectrum display of some sort and most of the energy is in the mid-range, which is where the voice sits too, so filtering alone doesn't get rid of it. Infodroid and I had decent results a couple of times by masking the music with different music in the same key. A lot of trial and error.