Benchmarking large language model-based agent systems for clinical decision tasks | Synapse