Digital film production is moving rapidly toward higher resolution, high dynamic range, and wide color gamut, which raises the bar for image quality optimization: processing accuracy, efficiency, and multi-scene adaptability must all improve together. Conventional post-production methods rely heavily on manual color correction and filtering, which have limited processing efficiency, are easily influenced by subjective preferences, and often produce inconsistent results across projection terminals. To address these challenges, this study designs a hybrid image quality optimization model called CT-GAN. The model combines the strength of convolutional neural networks in local feature extraction, the ability of Transformer structures to model global contextual information, and the contribution of generative adversarial networks to perceptual visual quality. The experiments use RAW-format and high dynamic range footage captured with ARRI Alexa 65 and RED V-Raptor cameras. On 4K single-frame images, the model achieves a peak signal-to-noise ratio (PSNR) of 35.2 dB, which is 23.4% higher than the traditional BM3D method, and a structural similarity index (SSIM) of 0.96, which is 4.3% higher than the classic pix2pix model. Single-frame processing takes only 28 ms, which meets the needs of real-time preview. In a subjective evaluation, five senior colorists gave the results an average score of 4.6 out of 5. The model also maintains consistent picture quality across projection environments, with quality retention rates exceeding 96% from IMAX laser cinemas to standard DCI projection systems.
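For readers unfamiliar with the headline metric, the following minimal sketch shows how a peak signal-to-noise ratio and a relative gain over a baseline are typically computed. It is not the authors' evaluation code: the function names `psnr` and `relative_gain`, the normalization of pixel values to [0, 1], and the sample values are all assumptions made for illustration, and the sketch assumes the reported percentages are relative gains on the metric value itself.

```python
import math


def psnr(reference, test, peak=1.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel sequences.

    `peak` is the maximum possible pixel value (assumed 1.0 for normalized data).
    """
    if len(reference) != len(test):
        raise ValueError("inputs must have the same number of samples")
    # Mean squared error over all samples.
    mse = sum((r - t) ** 2 for r, t in zip(reference, test)) / len(reference)
    if mse == 0:
        return math.inf  # identical inputs: PSNR is unbounded
    return 10.0 * math.log10(peak ** 2 / mse)


def relative_gain(value, baseline):
    """Relative improvement of `value` over `baseline`, as a percentage."""
    return 100.0 * (value - baseline) / baseline


# Hypothetical pixel values, normalized to [0, 1], purely for illustration.
reference = [0.0, 0.5, 1.0, 0.25]
test = [0.1, 0.5, 0.9, 0.25]
print(f"PSNR: {psnr(reference, test):.2f} dB")
```

For real 4K frames an array library would normally replace the plain Python loops; the arithmetic is the same.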
Weiqing Sun (Thu,) studied this question.