Efficient Inference of Large Language Models through Model Compression | Synapse