Model Compression and Efficient Inference for Large Language Models: A Survey