SeaLLM: Service-Aware and Latency-Optimized Resource Sharing for Large Language Model Inference