arXiv:2404.18343v1 Announce Type: new Abstract: With the evolution of Text-to-Image (T2I) models, the quality defects of AI-Generated Images (AIGIs) pose a significant barrier to their widespread adoption. In terms of both perception and alignment, existing models cannot always guarantee high-quality results. To mitigate this limitation, we introduce G-Refine, a general image quality refiner designed to enhance low-quality images without compromising the integrity of high-quality ones. The model is composed of three interconnected modules: a perception quality indicator, an alignment quality indicator, and a general quality enhancement module. Based on the mechanisms of the Human Visual System (HVS) and syntax trees, the first two indicators can respectively identify the perception and alignment deficiencies, and the last module can apply targeted quality enhancement accordingly. Extensive experimentation reveals that when compared to alternative optimization methods, AIGIs after G-Refine outperform in 10+ quality metrics across 4 databases. This improvement significantly contributes to the practical application of contemporary T2I models, paving the way for their broader adoption. The code will be released on https://github.com/Q-Future/Q-Refine.


G-Refine: Enhancing the Quality of AI-Generated Images


The paper "G-Refine: Enhancing the Quality of AI-Generated Images" addresses a crucial issue in the field of Text-to-Image (T2I) models: the quality defects of AI-Generated Images (AIGIs). The authors highlight that existing models often fail to produce consistently high-quality results, both in terms of perception and alignment, a limitation that has hindered the widespread adoption of T2I models.

To overcome this challenge, the authors propose G-Refine, a general image quality refiner designed to enhance low-quality images while maintaining the integrity of high-quality ones. G-Refine consists of three interconnected modules: a perception quality indicator, an alignment quality indicator, and a general quality enhancement module.

The perception quality indicator leverages the mechanisms of the Human Visual System (HVS) to identify perception deficiencies in the generated images. By modeling how humans perceive an image, it can pinpoint where a generated image falls short in terms of visual quality, which is what makes targeted enhancement possible.
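To make the idea concrete, here is a minimal, hypothetical sketch of an HVS-flavored indicator, not the paper's actual model: it scores per-patch detail from gradient energy, on the assumption that flat, blurry patches are likely perceptual defects. The function name `perception_quality_map`, the patch size, and the 0.3 threshold are all invented for illustration.

```python
# Hypothetical sketch, NOT G-Refine's indicator: a crude HVS-inspired
# detail map that flags flat/blurry patches as likely quality defects.
import numpy as np
from PIL import Image

def perception_quality_map(path: str, patch: int = 32) -> np.ndarray:
    """Coarse per-patch quality proxy in [0, 1] (higher = more local detail)."""
    gray = np.asarray(Image.open(path).convert("L"), dtype=np.float32) / 255.0
    gy, gx = np.gradient(gray)          # local luminance change
    energy = np.hypot(gx, gy)           # gradient magnitude per pixel
    h, w = energy.shape
    hp, wp = h // patch, w // patch
    blocks = energy[: hp * patch, : wp * patch].reshape(hp, patch, wp, patch)
    detail = blocks.mean(axis=(1, 3))   # average detail per patch
    return detail / (detail.max() + 1e-8)

qmap = perception_quality_map("generated.png")        # placeholder filename
print("patches flagged as low quality:", int((qmap < 0.3).sum()))
```

A learned no-reference IQA model would replace the gradient heuristic in practice; the point is only that such an indicator localizes where quality is low, rather than producing a single verdict for the whole image.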

The alignment quality indicator, in turn, uses syntax trees to identify alignment deficiencies, focusing on whether the generated image accurately reflects the given textual description. By analyzing the syntactic structure of the prompt and comparing its components against the image, it can identify which parts of the prompt the image fails to depict.
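As a hedged illustration of this idea (again not the paper's implementation), one could extract noun phrases from the prompt with a dependency parser and score each phrase against the image with CLIP; low-scoring phrases mark the parts of the prompt the image likely fails to depict. The function `alignment_scores` and the choice of spaCy's `en_core_web_sm` and `openai/clip-vit-base-patch32` are assumptions for this sketch.

```python
# Hypothetical sketch, NOT the paper's method: parse the prompt into noun
# phrases, then score each phrase against the image with CLIP.
import spacy
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

nlp = spacy.load("en_core_web_sm")  # small English dependency parser
model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

def alignment_scores(prompt: str, image_path: str) -> dict:
    """Per-noun-phrase image-text similarity; low values suggest misalignment."""
    phrases = [c.text for c in nlp(prompt).noun_chunks] or [prompt]
    image = Image.open(image_path).convert("RGB")
    inputs = processor(text=phrases, images=image,
                       return_tensors="pt", padding=True)
    with torch.no_grad():
        sims = model(**inputs).logits_per_image.squeeze(0)  # one score per phrase
    return dict(zip(phrases, sims.tolist()))

print(alignment_scores("a red bicycle leaning on a brick wall", "generated.png"))
```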

Finally, the general quality enhancement module takes the outputs from the perception and alignment quality indicators and applies targeted quality enhancement techniques. This module leverages the identified deficiencies to refine the low-quality areas of the generated images while preserving the integrity of high-quality areas.
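One plausible (and purely illustrative) way to wire the pieces together is to gate an off-the-shelf img2img pass by the indicator's score, so low-quality images get aggressive re-denoising while high-quality ones are left nearly untouched. The model id and the linear score-to-strength mapping below are assumptions, not G-Refine's pipeline.

```python
# Hypothetical sketch, NOT G-Refine's enhancement module: scale img2img
# denoising strength by how low the measured quality is.
import torch
from diffusers import StableDiffusionImg2ImgPipeline
from PIL import Image

pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

def refine(image_path: str, prompt: str, quality: float) -> Image.Image:
    """quality in [0, 1]; high-quality inputs are barely touched."""
    image = Image.open(image_path).convert("RGB")
    # Assumed linear schedule: low quality -> strong re-denoising,
    # high quality -> near no-op, preserving good images' integrity.
    strength = 0.1 + 0.6 * (1.0 - quality)
    return pipe(prompt=prompt, image=image, strength=strength).images[0]

refined = refine("generated.png", "a red bicycle leaning on a brick wall", 0.35)
refined.save("refined.png")
```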

The authors conducted extensive experiments to evaluate the effectiveness of G-Refine, comparing refined AIGIs against alternative optimization methods across four databases. The results showed that images refined with G-Refine outperformed the alternatives on more than ten quality metrics, indicating a significant improvement in image quality. This improvement matters for the practical application of contemporary T2I models, as it paves the way for their broader adoption.
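For readers who want a feel for such a multi-metric comparison, the open-source IQA-PyTorch (`pyiqa`) package bundles many no-reference quality metrics behind one interface. The metric list and filenames below are assumptions for illustration, not the paper's evaluation protocol.

```python
# Hedged sketch of a multi-metric quality comparison with pyiqa
# (IQA-PyTorch); the metric names and files are illustrative choices.
import pyiqa
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"
names = ["niqe", "brisque", "musiq", "clipiqa"]  # common no-reference metrics
metrics = {n: pyiqa.create_metric(n, device=device) for n in names}

for img in ["baseline.png", "refined.png"]:      # placeholder filenames
    scores = {n: float(m(img)) for n, m in metrics.items()}
    print(img, scores)
```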

The authors also announced that the code for G-Refine will be released at https://github.com/Q-Future/Q-Refine. This will facilitate further research and development in the field, allowing other researchers and practitioners to build on the proposed method.

In summary, the introduction of G-Refine addresses an important challenge in the field of T2I models by improving the quality of AI-generated images. By leveraging perception and alignment quality indicators, as well as a general quality enhancement module, G-Refine demonstrates superior performance compared to alternative optimization methods. This advancement holds promise for the practical application and wider adoption of T2I models, ultimately benefiting various domains such as creative design, virtual reality, and content generation.