A large-scale benchmark for evaluating large language models on medical question answering in Romanian | Synapse