Vox2Face: Speech-Driven Face Generation via Identity-Space Alignment and Diffusion Self-Consistency | Synapse