Inadequacies of Large Language Model Benchmarks in the Era of Generative Artificial Intelligence | Synapse