by jsendak | Apr 10, 2024 | Art
Artes Mundi, a prestigious international art prize, has captured the attention of artists and art enthusiasts worldwide for over two decades. As we mark the start of its 21st anniversary year, the excitement and anticipation surrounding Artes Mundi 11 (AM11) continue to grow.
Since its inception in 2003, Artes Mundi has been a platform for celebrating and promoting contemporary visual artists who address important social and political themes. The prize aims to recognize artists who use their creative expression to shed light on global issues, bringing awareness to the complexities of our modern world.
In an era defined by rapid globalization, environmental challenges, and sociopolitical upheavals, the search for meaning and understanding has never been more crucial. Art has always been a lens through which we interpret and reflect upon the world around us. Artists have the power to provoke, inspire, and challenge conventional thinking.
Throughout history, art has played a pivotal role in documenting and addressing societal changes. From the Italian Renaissance masters capturing the human form with unprecedented realism to the avant-garde movements of the 20th century challenging established norms, art has been a catalyst for societal transformation.
Today, artists continue to push the boundaries of their craft, exploring new mediums and engaging with pressing issues that shape our lives. They provide us with a visual language that transcends barriers and invites dialogue, asking us to question our assumptions and consider alternative viewpoints.
Artes Mundi understands the power of art in confronting these challenges. With each edition of the prize, it shines a spotlight on artists who confront contemporary issues such as inequality, migration, climate change, and identity. Through their work, these artists prompt us to engage critically with these pressing topics.
As we embark on the nomination process for Artes Mundi 11, we invite artists from all corners of the globe to submit their work and contribute to the ongoing dialogue surrounding these crucial issues. AM11 will undoubtedly be a celebration of artistic excellence and intellectual exploration, igniting conversations that transcend borders and forge connections among diverse cultures and perspectives.
Artes Mundi 11 is not only a showcase of the power of art but also a reminder of our collective duty to nurture and support artists who dare to tackle the most pressing challenges of our time. Join us in celebrating the incredible talent that exists in the world and in championing the transformative power of contemporary art.
Marking the start of its 21st anniversary year, Artes Mundi announces the call to nominate artists for Artes Mundi 11 (AM11), open from Wednesday, April 10 until Friday, May 31.
Read the original article
by jsendak | Apr 8, 2024 | Computer Science
arXiv:2404.04037v1 Announce Type: cross
Abstract: We present InstructHumans, a novel framework for instruction-driven 3D human texture editing. Existing text-based editing methods use Score Distillation Sampling (SDS) to distill guidance from generative models. This work shows that naively using such scores is harmful to editing as they destroy consistency with the source avatar. Instead, we propose an alternate SDS for Editing (SDS-E) that selectively incorporates subterms of SDS across diffusion timesteps. We further enhance SDS-E with spatial smoothness regularization and gradient-based viewpoint sampling to achieve high-quality edits with sharp and high-fidelity detailing. InstructHumans significantly outperforms existing 3D editing methods, consistent with the initial avatar while faithful to the textual instructions. Project page: https://jyzhu.top/instruct-humans .
InstructHumans: Enhancing Instruction-driven 3D Human Texture Editing
In the field of multimedia information systems, instruction-driven 3D human texture editing can improve the visual quality and realism of virtual characters. This emerging area draws on several disciplines, including animation, artificial reality, augmented reality, and virtual reality.
The article introduces a novel framework called InstructHumans, which aims to improve instruction-driven 3D human texture editing. It addresses a limitation of existing text-based editing methods, which use Score Distillation Sampling (SDS) to distill guidance from generative models. The authors show that naively using these scores harms editing because it destroys consistency with the source avatar.
To overcome this challenge, the researchers propose an alternative approach called Score Distillation Sampling for Editing (SDS-E). This method selectively incorporates subterms of SDS across diffusion timesteps, ensuring that edits maintain consistency with the original avatar. Furthermore, SDS-E is enhanced with spatial smoothness regularization and gradient-based viewpoint sampling to achieve high-quality edits with sharp and high-fidelity detailing.
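The idea of keeping only some SDS subterms at some timesteps can be made concrete with a small sketch. It assumes the common classifier-free-guidance form of the SDS residual, which splits into a denoising term and a guidance term. The selection rule below (the `t_switch` cutoff and which term is dropped) is purely hypothetical and is not the SDS-E rule from the paper; all function and parameter names are illustrative.

```python
import numpy as np

def sds_residual_terms(eps_uncond, eps_cond, eps, guidance_scale):
    """Split the classifier-free-guided SDS residual into two subterms.

    With eps_pred = eps_uncond + s * (eps_cond - eps_uncond), the residual
    eps_pred - eps equals (eps_uncond - eps) + s * (eps_cond - eps_uncond).
    """
    denoise_term = eps_uncond - eps
    guidance_term = guidance_scale * (eps_cond - eps_uncond)
    return denoise_term, guidance_term

def selective_sds_grad(eps_uncond, eps_cond, eps, t, guidance_scale=7.5, t_switch=0.5):
    """Hypothetical selection rule (NOT the paper's): for normalized timestep
    t in [0, 1], keep the denoising term only when t < t_switch and always
    keep the guidance term."""
    denoise_term, guidance_term = sds_residual_terms(
        eps_uncond, eps_cond, eps, guidance_scale
    )
    keep_denoise = 1.0 if t < t_switch else 0.0
    return keep_denoise * denoise_term + guidance_term

# Toy noise predictions standing in for a diffusion model's outputs.
eps_uncond = np.full((2, 2), 0.2)
eps_cond = np.full((2, 2), 0.4)
eps = np.full((2, 2), 0.1)
early = selective_sds_grad(eps_uncond, eps_cond, eps, t=0.2, guidance_scale=2.0)
late = selective_sds_grad(eps_uncond, eps_cond, eps, t=0.8, guidance_scale=2.0)
```

At the early timestep both subterms contribute, so the result equals the plain SDS residual; at the late timestep only the guidance term remains.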
The results of the study demonstrate that InstructHumans outperforms existing 3D editing methods in terms of preserving consistency with the source avatar while faithfully following the given textual instructions. This advancement in the field of instruction-driven 3D human texture editing paves the way for more immersive and realistic virtual experiences.
The significance of this work extends beyond the specific application of 3D human texture editing. By combining insights from animation, artificial reality, augmented reality, and virtual reality, the researchers contribute to the broader field of multimedia information systems. Such interdisciplinary work enables more advanced techniques for creating and manipulating virtual content.
In conclusion, the InstructHumans framework represents a valuable contribution to the field of instruction-driven 3D human texture editing. Its novel approach addresses the limitations of existing methods and demonstrates improved consistency and fidelity in edits. The work also shows how interdisciplinary collaboration advances multimedia information systems and highlights its relevance to the wider domains of animation, artificial reality, augmented reality, and virtual reality.
by jsendak | Mar 28, 2024 | Computer Science
arXiv:2403.17420v1 Announce Type: new
Abstract: The goal of the multi-sound source localization task is to localize sound sources from the mixture individually. While recent multi-sound source localization methods have shown improved performance, they face challenges due to their reliance on prior information about the number of objects to be separated. In this paper, to overcome this limitation, we present a novel multi-sound source localization method that can perform localization without prior knowledge of the number of sound sources. To achieve this goal, we propose an iterative object identification (IOI) module, which can recognize sound-making objects in an iterative manner. After finding the regions of sound-making objects, we devise object similarity-aware clustering (OSC) loss to guide the IOI module to effectively combine regions of the same object but also distinguish between different objects and backgrounds. It enables our method to perform accurate localization of sound-making objects without any prior knowledge. Extensive experimental results on the MUSIC and VGGSound benchmarks show the significant performance improvements of the proposed method over the existing methods for both single and multi-source. Our code is available at: https://github.com/VisualAIKHU/NoPrior_MultiSSL
Expert Commentary: Advancements in Multi-Sound Source Localization
Multi-sound source localization is a crucial task in the field of multimedia information systems, as it enables the identification and localization of sound sources in a given environment. The ability to accurately localize sound sources has wide-ranging applications, including audio scene analysis, surveillance systems, and virtual reality experiences.
The article introduces a novel method for multi-sound source localization that overcomes the limitation of requiring prior knowledge about the number of sound sources to be separated. This is a significant advancement, as it allows for more flexible and adaptable localization in real-world scenarios where prior information is often unavailable.
One notable feature of the proposed method is the iterative object identification (IOI) module. This module identifies sound-making objects in the mixture one iteration at a time, so the method does not need to be told in advance how many sources are present. The approach draws on concepts from signal processing, machine learning, and computer vision.
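To illustrate how an iterative procedure can avoid fixing the number of sources in advance, here is a minimal toy sketch. It is not the paper's IOI module, which is a learned component. It assumes a precomputed audio-visual similarity map and simply peels off one connected high-response region per iteration until nothing above a threshold remains. The function names, the threshold, and the flood-fill region rule are all assumptions made for illustration.

```python
import numpy as np

def flood_region(active, start):
    """Return the 4-connected region of True cells in `active` containing `start`."""
    height, width = active.shape
    mask = np.zeros_like(active, dtype=bool)
    stack = [start]
    while stack:
        r, c = stack.pop()
        if r < 0 or r >= height or c < 0 or c >= width:
            continue
        if mask[r, c] or not active[r, c]:
            continue
        mask[r, c] = True
        stack.extend([(r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)])
    return mask

def iterative_identify(similarity_map, threshold=0.5, max_iters=10):
    """Extract one region per iteration; stop when the peak falls below threshold.

    The number of regions found is an output, not an input. The threshold is
    assumed to be positive so that removed regions stay inactive.
    """
    residual = np.asarray(similarity_map, dtype=float).copy()
    regions = []
    for _ in range(max_iters):
        peak = np.unravel_index(np.argmax(residual), residual.shape)
        if residual[peak] < threshold:
            break
        mask = flood_region(residual >= threshold, peak)
        regions.append(mask)
        residual[mask] = 0.0  # remove the found object before the next pass
    return regions

# Toy map with two separate high-response blobs.
toy_map = np.zeros((4, 6))
toy_map[0:2, 0:2] = 0.9
toy_map[2:4, 4:6] = 0.8
found = iterative_identify(toy_map)
```

Here `found` holds one boolean mask per detected blob, and the loop terminates on its own once the residual map contains no response above the threshold.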
To further enhance the accuracy of localization, the authors introduce the object similarity-aware clustering (OSC) loss. This loss function guides the IOI module to effectively combine regions of the same object while also distinguishing between different objects and backgrounds. By incorporating object similarity awareness into the clustering process, the proposed method achieves better discrimination and localization performance.
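The "combine the same object, separate different objects" behavior can be sketched with a generic pairwise contrastive loss. This is a stand-in technique, not the OSC loss defined in the paper: it assumes region feature vectors (non-zero) and a known boolean matrix saying which pairs belong to the same object, both of which are hypothetical inputs chosen for illustration, as is the margin value.

```python
import numpy as np

def cosine_similarity_matrix(features):
    """Pairwise cosine similarity between row vectors (rows must be non-zero)."""
    normed = features / np.linalg.norm(features, axis=1, keepdims=True)
    return normed @ normed.T

def pairwise_clustering_loss(features, same_object, margin=0.2):
    """Generic contrastive stand-in: pull same-object regions together and
    push different-object regions apart until similarity drops below `margin`."""
    sim = cosine_similarity_matrix(np.asarray(features, dtype=float))
    same = np.asarray(same_object, dtype=bool)
    off_diag = ~np.eye(len(sim), dtype=bool)  # ignore each region paired with itself
    pull = (1.0 - sim)[same & off_diag]
    push = np.maximum(0.0, sim - margin)[~same & off_diag]
    pull_loss = pull.mean() if pull.size else 0.0
    push_loss = push.mean() if push.size else 0.0
    return float(pull_loss + push_loss)

# Regions 0 and 1 are labeled as the same object; region 2 is a different one.
same_object = np.array([[1, 1, 0], [1, 1, 0], [0, 0, 1]], dtype=bool)
well_clustered = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
badly_clustered = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]])
good_loss = pairwise_clustering_loss(well_clustered, same_object)
bad_loss = pairwise_clustering_loss(badly_clustered, same_object)
```

The loss is zero when same-object regions share a direction and different objects are orthogonal, and it grows when those relationships are violated.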
The experimental results on the MUSIC and VGGSound benchmarks demonstrate the significant performance improvements of the proposed method over existing methods for both single and multi-source localization. This suggests that the method can accurately identify and localize sound sources in various scenarios, making it suitable for real-world applications.
In the wider field of multimedia information systems, advances in multi-sound source localization have implications for animation, artificial reality, augmented reality, and virtual reality. Accurate localization of sound sources in these contexts can greatly enhance the immersion and realism of multimedia content. For example, in virtual reality applications, precise localization of virtual sound sources can create a more realistic and engrossing environment for users.
In conclusion, the proposed method for multi-sound source localization without prior knowledge showcases the continual progress in the field of multimedia information systems. The multi-disciplinary nature of this research, alongside the significant performance improvements, paves the way for enhanced multimedia experiences in various domains, including animation, artificial reality, augmented reality, and virtual reality.
by jsendak | Mar 22, 2024 | Art
The Future of Documentary Filmmaking
In a recent interview, Oscar-winning director, (Director’s Name), reflects on the life of the Russian opposition leader and the mission of documentary filmmaking. This conversation sparks a deeper contemplation on the potential future trends in this industry. With advancements in technology and evolving viewer preferences, documentary filmmaking is set to undergo a significant transformation.
Persistence of Realism
One key point highlighted by (Director’s Name) is the importance of realism in documentary filmmaking. As audiences become increasingly discerning, the demand for authentic and unbiased storytelling is on the rise. This trend is likely to continue, with viewers seeking documentaries that provide genuine insights into real people and events.
To cater to this demand, future documentaries will need to focus on capturing unfiltered moments and candid interviews, delving deep into the stories that shape our world. Filmmakers will need to employ innovative techniques to capture the rawness of reality, enabling viewers to connect with the subject matter on a deeply emotional level.
The Impact of Technology
The technological advancements in recent years have transformed the documentary filmmaking landscape. With the availability of high-quality cameras and editing tools, filmmakers now have greater freedom to experiment with different styles and techniques. This has opened up new avenues of storytelling and enhanced the overall visual appeal of documentaries.
In the future, emerging technologies such as virtual reality (VR) and augmented reality (AR) are likely to play a significant role in documentary filmmaking. These immersive technologies offer a unique and engaging way to experience real-life stories, allowing viewers to become active participants in the narrative. Filmmakers will need to adapt to these new mediums and develop innovative storytelling techniques to make the most of these advances.
The Rise of Interactive Storytelling
Another potential trend in the documentary industry is the rise of interactive storytelling. With the advent of online platforms and streaming services, filmmakers now have the opportunity to engage viewers in new and exciting ways. Interactive documentaries, such as those incorporating elements of gamification or allowing viewers to choose their own narrative paths, have already started gaining traction.
In the future, interactive storytelling will become more prevalent, blurring the lines between filmmaker and viewer. This trend will allow audiences to actively participate in the documentary, shaping the outcome based on their decisions. This form of collaborative storytelling has the potential to create a more immersive and personalized experience for viewers.
Recommendations for the Industry
- Embrace new technology: Filmmakers should actively explore and adopt emerging technologies such as VR and AR to enhance the immersive experience of their documentaries. By pushing the boundaries of traditional storytelling, they can captivate audiences in new and exciting ways.
- Prioritize authenticity: In an era of misinformation, it is crucial for documentary filmmakers to maintain a commitment to authenticity and unbiased storytelling. This means conducting thorough research, fact-checking, and presenting multiple perspectives to provide viewers with a well-rounded understanding of the subject matter.
- Collaborate and experiment: The future of documentary filmmaking lies in collaboration and experimentation. Filmmakers should collaborate with experts from diverse fields, such as technology and interactive design, to push the boundaries of storytelling. Experimenting with different formats, styles, and platforms will help filmmakers stay relevant and captivate ever-evolving audiences.
In conclusion, the future of documentary filmmaking is bright and promising. The persisting demand for authenticity, coupled with advancements in technology and interactive storytelling, will reshape the industry. By embracing new technologies, prioritizing authenticity, and collaborating with various experts, documentary filmmakers can create compelling narratives that resonate deeply with audiences.
by jsendak | Mar 11, 2024 | AI
arXiv:2403.05029v1 Announce Type: new
Abstract: Traffic prediction is one of the most significant foundations in Intelligent Transportation Systems (ITS). Traditional traffic prediction methods rely only on historical traffic data to predict traffic trends and face two main challenges: 1) insensitivity to unusual events; 2) limited performance in long-term prediction. In this work, we explore how generative models combined with text describing the traffic system can be applied for traffic generation, and name the task Text-to-Traffic Generation (TTG). The key challenge of the TTG task is how to associate text with the spatial structure of the road network and traffic data for generating traffic situations. To this end, we propose ChatTraffic, the first diffusion model for text-to-traffic generation. To guarantee the consistency between synthetic and real data, we augment a diffusion model with the Graph Convolutional Network (GCN) to extract spatial correlations of traffic data. In addition, we construct a large dataset containing text-traffic pairs for the TTG task. We benchmarked our model qualitatively and quantitatively on the released dataset. The experimental results indicate that ChatTraffic can generate realistic traffic situations from the text. Our code and dataset are available at https://github.com/ChyaZhang/ChatTraffic.
The paper addresses the challenges faced by traditional traffic prediction methods and introduces a novel task called Text-to-Traffic Generation (TTG). The TTG task aims to generate traffic situations by combining generative models with text descriptions of the traffic system. The key challenge lies in associating text with the spatial structure of the road network and traffic data. The authors propose ChatTraffic, the first diffusion model for text-to-traffic generation, which incorporates a Graph Convolutional Network (GCN) to extract spatial correlations. They also construct a large dataset of text-traffic pairs for benchmarking purposes. The experimental results indicate that ChatTraffic can generate realistic traffic situations from text descriptions. The code and dataset for this model are publicly available.
Traffic Prediction and Text-to-Traffic Generation: Paving the Way for Intelligent Transportation Systems
Intelligent Transportation Systems (ITS) have become an integral part of modern urban infrastructure, aiming to enhance traffic management and efficiency. One of the foundational pillars of ITS is traffic prediction, which enables authorities to anticipate traffic trends and plan proactive measures to alleviate congestion. However, traditional traffic prediction methods have their limitations, mainly due to their reliance solely on historical traffic data. This article explores a novel approach to traffic prediction by combining generative models with text descriptions of the traffic system, introducing the concept of Text-to-Traffic Generation (TTG).
The Challenges of Traditional Traffic Prediction
Traditional traffic prediction methods face two significant challenges. Firstly, they tend to be insensitive to unusual events such as accidents or major construction, which can lead to unpredictable traffic patterns. Secondly, these methods often display limited performance in long-term prediction, struggling to capture complex and evolving traffic dynamics. Addressing these challenges is crucial to developing more accurate and reliable traffic prediction models.
The Emergence of Text-to-Traffic Generation
Text-to-Traffic Generation (TTG) offers a fresh perspective on traffic prediction by incorporating textual information along with historical traffic data. The key challenge of the TTG task lies in effectively associating text descriptions with the spatial structure of the road network and traffic data to generate realistic traffic situations. In response to this, researchers have introduced ChatTraffic, the first diffusion model designed specifically for text-to-traffic generation.
The Role of ChatTraffic in Traffic Generation
ChatTraffic uses a diffusion model augmented with a Graph Convolutional Network (GCN) that extracts spatial correlations from traffic data. According to the authors, this augmentation is what keeps the synthetic traffic consistent with real data, while the text descriptions condition what is generated. The authors also construct a large dataset of text-traffic pairs specifically for the TTG task.
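To show what "extracting spatial correlations" with a graph convolution can look like, here is a minimal sketch of the widely used propagation rule with self-loops and symmetric normalization. Whether ChatTraffic uses exactly this layer is an assumption; the three-road network, the feature values, and the function names are hypothetical.

```python
import numpy as np

def normalized_adjacency(adj):
    """Symmetrically normalize an adjacency matrix after adding self-loops."""
    a_hat = adj + np.eye(adj.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def gcn_layer(adj, features, weights):
    """One graph convolution: mix each node's features with its neighbors',
    apply a linear map, then a ReLU."""
    return np.maximum(0.0, normalized_adjacency(adj) @ features @ weights)

# Hypothetical road network: three road segments in a line (0 - 1 - 2).
adj = np.array([[0.0, 1.0, 0.0],
                [1.0, 0.0, 1.0],
                [0.0, 1.0, 0.0]])
features = np.array([[1.0], [0.0], [0.0]])  # e.g. a congestion signal on road 0
weights = np.array([[1.0]])                 # identity map, for readability
out = gcn_layer(adj, features, weights)
```

After one layer, the signal on road 0 has spread to its neighbor road 1 but not yet to road 2, which is the sense in which stacked graph convolutions capture spatial structure.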
Benchmarking ChatTraffic: Qualitative and Quantitative Evaluation
To evaluate the performance of ChatTraffic, the model has been benchmarked both qualitatively and quantitatively using the released dataset. The experimental results demonstrate that ChatTraffic is capable of generating realistic traffic situations based on textual inputs. This breakthrough in traffic generation opens up new possibilities for forecasting traffic patterns with greater accuracy and capturing the effects of unusual events on traffic dynamics.
The Road Ahead
The introduction of Text-to-Traffic Generation (TTG) through models like ChatTraffic showcases the potential of leveraging textual context to enhance traffic prediction. As research advances in this field, further improvements and innovations can be expected, leading to more efficient traffic management and intelligent transportation systems. The availability of the ChatTraffic code and dataset on GitHub (https://github.com/ChyaZhang/ChatTraffic) enables the wider research community to explore and contribute to this exciting development.
The paper introduces a novel approach called ChatTraffic, which combines generative models with text descriptions of the traffic system to generate realistic traffic situations. This task, referred to as Text-to-Traffic Generation (TTG), aims to address the limitations of traditional traffic prediction methods that rely solely on historical traffic data.
One of the key challenges in the TTG task is how to associate the textual information with the spatial structure of the road network and traffic data. To overcome this challenge, the authors propose augmenting a diffusion model with a Graph Convolutional Network (GCN) to extract spatial correlations from the traffic data. This allows the generated traffic situations to be consistent with real data.
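For readers unfamiliar with diffusion models, the forward (noising) step that such a model learns to reverse can be sketched in a few lines. This is the standard formulation with a linear noise schedule, shown as an assumption about the general model family rather than ChatTraffic's actual configuration; the schedule values, the toy traffic vector, and the function names are illustrative.

```python
import numpy as np

def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Per-step noise variances, increasing linearly over the schedule."""
    return np.linspace(beta_start, beta_end, num_steps)

def forward_noise(x0, t, betas, rng):
    """Sample x_t = sqrt(alpha_bar_t) * x0 + sqrt(1 - alpha_bar_t) * noise."""
    alpha_bar = np.cumprod(1.0 - betas)[t]
    noise = rng.standard_normal(x0.shape)
    x_t = np.sqrt(alpha_bar) * x0 + np.sqrt(1.0 - alpha_bar) * noise
    return x_t, noise

betas = linear_beta_schedule(1000)
rng = np.random.default_rng(0)
x0 = np.ones(4)  # toy stand-in for normalized speeds on four road segments
x_early, noise_early = forward_noise(x0, 0, betas, rng)
x_late, noise_late = forward_noise(x0, 999, betas, rng)
```

At the first step the sample is almost identical to the clean data, while at the last step it is almost pure noise. A denoising network, here one that would also see the text and the road graph, is trained to undo this corruption.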
To evaluate the performance of ChatTraffic, the authors construct a large dataset containing text-traffic pairs specifically for the TTG task. They then benchmark their model both qualitatively and quantitatively using this dataset. The experimental results demonstrate that ChatTraffic is capable of generating realistic traffic situations from the provided text descriptions.
This research has significant implications for Intelligent Transportation Systems (ITS) as it offers a new approach to traffic prediction that overcomes the challenges of insensitivity to unusual events and limited long-term prediction performance. By incorporating text descriptions, ChatTraffic has the potential to improve the accuracy and reliability of traffic prediction models.
Moving forward, it would be interesting to see further advancements in this field. For instance, comparing the diffusion approach against other families of generative models, such as Variational Autoencoders (VAEs) or Generative Adversarial Networks (GANs), could clarify which is best suited to producing realistic traffic situations. Additionally, incorporating real-time data sources, such as social media feeds or weather information, could further improve the predictive capabilities of ChatTraffic by capturing dynamic factors that influence traffic patterns.