DHP Benchmark: Are LLMs Good NLG Evaluators? | Synapse