Enhancing Inference Efficiency of Large Language Models: Investigating Optimization Strategies and Architectural Innovations | Synapse