by jsendak | Apr 24, 2025 | Computer Science
arXiv:2504.15376v1 Announce Type: cross
Abstract: We introduce CameraBench, a large-scale dataset and benchmark designed to assess and improve camera motion understanding. CameraBench consists of ~3,000 diverse internet videos, annotated by experts through a rigorous multi-stage quality control process. One of our contributions is a taxonomy of camera motion primitives, designed in collaboration with cinematographers. We find, for example, that some motions like “follow” (or tracking) require understanding scene content like moving subjects. We conduct a large-scale human study to quantify human annotation performance, revealing that domain expertise and tutorial-based training can significantly enhance accuracy. For example, a novice may confuse zoom-in (a change of intrinsics) with translating forward (a change of extrinsics), but can be trained to differentiate the two. Using CameraBench, we evaluate Structure-from-Motion (SfM) and Video-Language Models (VLMs), finding that SfM models struggle to capture semantic primitives that depend on scene content, while VLMs struggle to capture geometric primitives that require precise estimation of trajectories. We then fine-tune a generative VLM on CameraBench to achieve the best of both worlds and showcase its applications, including motion-augmented captioning, video question answering, and video-text retrieval. We hope our taxonomy, benchmark, and tutorials will drive future efforts towards the ultimate goal of understanding camera motions in any video.
CameraBench: A Step Towards Understanding Camera Motion in Videos
In the world of multimedia information systems, understanding camera motion in videos is a crucial task, with applications across domains such as animation, virtual reality, and augmented reality. To improve camera motion understanding, a team of researchers has introduced CameraBench, a large-scale dataset and benchmark.
CameraBench comprises approximately 3,000 diverse internet videos that have been annotated by experts using a rigorous multi-stage quality control process. This dataset presents a significant contribution to the field, as it provides a valuable resource for assessing and improving camera motion understanding algorithms.
One key aspect of CameraBench is the collaboration with cinematographers, which has led to the development of a taxonomy of camera motion primitives. This taxonomy helps classify different types of camera motions and their dependencies on scene content. For example, a camera motion like “follow” requires understanding of moving subjects in the scene.
To quantify human annotation performance, the authors conducted a large-scale human study. The results showed that domain expertise and tutorial-based training significantly enhance accuracy. Novices may initially confuse camera motions like zoom-in (a change of intrinsics) with translating forward (a change of extrinsics), but with training they can learn to tell the two apart.
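To make the intrinsics/extrinsics distinction concrete, here is a minimal illustrative sketch (not from the paper) using a standard pinhole camera model: a zoom-in changes the focal length inside the intrinsic matrix K, while translating forward changes the camera position in the extrinsics, even though both can enlarge the subject in the image.

```python
import numpy as np

def intrinsics(f):
    # Pinhole intrinsic matrix K with focal length f (in pixels) and a fixed principal point.
    cx, cy = 320.0, 240.0
    return np.array([[f, 0.0, cx],
                     [0.0, f, cy],
                     [0.0, 0.0, 1.0]])

def project(K, cam_pos, point_world):
    # Extrinsics (identity rotation, camera at cam_pos) map world -> camera coordinates;
    # the intrinsics then map camera coordinates -> pixel coordinates.
    p_cam = point_world - cam_pos
    p_img = K @ p_cam
    return p_img[:2] / p_img[2]

point = np.array([0.5, 0.0, 10.0])   # a point 0.5 m to the side, 10 m in front of the camera

baseline  = project(intrinsics(500.0),  np.zeros(3),               point)
zoom_in   = project(intrinsics(1000.0), np.zeros(3),               point)  # intrinsics change
dolly_fwd = project(intrinsics(500.0),  np.array([0.0, 0.0, 5.0]), point)  # extrinsics change

print(baseline, zoom_in, dolly_fwd)
# For this single point, zooming and dollying produce the same pixel offset from the
# image centre, which is exactly why the two motions are easy to confuse.
```

Across a full scene the two do differ: a dolly forward induces depth-dependent parallax, while a zoom scales the image uniformly, and that is precisely the kind of cue annotator training can emphasize.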
The researchers also evaluated Structure-from-Motion (SfM) models and Video-Language Models (VLMs) using CameraBench. They found that SfM models struggle to capture semantic primitives that depend on scene content, while VLMs struggle with geometric primitives that require precise estimation of trajectories. To address these limitations, a generative VLM was fine-tuned with CameraBench to achieve a hybrid model that combines the strengths of both approaches.
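As a hedged illustration of how a benchmark of this kind can score models (the label names and protocol below are assumptions for exposition, not CameraBench's actual evaluation code), per-primitive accuracy can be computed by comparing a model's predicted motion labels against expert annotations:

```python
from collections import defaultdict

# Hypothetical expert labels and model predictions, keyed by video id.
expert = {"v1": "pan-left", "v2": "zoom-in", "v3": "dolly-forward", "v4": "follow"}
model  = {"v1": "pan-left", "v2": "dolly-forward", "v3": "dolly-forward", "v4": "static"}

correct, total = defaultdict(int), defaultdict(int)
for vid, label in expert.items():
    total[label] += 1
    if model.get(vid) == label:
        correct[label] += 1

for label in total:
    print(f"{label}: {correct[label] / total[label]:.2f}")  # per-primitive accuracy
```

Breaking scores down per primitive is what lets a benchmark show, for example, that SfM-style methods miss semantic primitives such as "follow" while VLMs miss geometric ones.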
This hybrid model opens up a range of applications, including motion-augmented captioning, video question answering, and video-text retrieval. By better understanding camera motions in videos, these applications can be enhanced, providing more immersive experiences for users.
The taxonomy, benchmark, and tutorials provided with CameraBench are valuable resources for researchers and practitioners working towards the ultimate goal of understanding camera motions in any video. The multi-disciplinary nature of camera motion understanding makes it relevant to various fields, including multimedia information systems, animation, virtual reality, and augmented reality.
Read the original article
by jsendak | Apr 24, 2025 | AI
arXiv:2504.15304v1 Announce Type: new
Abstract: Machine Learning ML agents have been increasingly used in decision-making across a wide range of tasks and environments. These ML agents are typically designed to balance multiple objectives when making choices. Understanding how their decision-making processes align with or diverge from human reasoning is essential. Human agents often encounter hard choices, that is, situations where options are incommensurable; neither option is preferred, yet the agent is not indifferent between them. In such cases, human agents can identify hard choices and resolve them through deliberation. In contrast, current ML agents, due to fundamental limitations in Multi-Objective Optimisation or MOO methods, cannot identify hard choices, let alone resolve them. Neither Scalarised Optimisation nor Pareto Optimisation, the two principal MOO approaches, can capture incommensurability. This limitation generates three distinct alignment problems: the alienness of ML decision-making behaviour from a human perspective; the unreliability of preference-based alignment strategies for hard choices; and the blockage of alignment strategies pursuing multiple objectives. Evaluating two potential technical solutions, I recommend an ensemble solution that appears most promising for enabling ML agents to identify hard choices and mitigate alignment problems. However, no known technique allows ML agents to resolve hard choices through deliberation, as they cannot autonomously change their goals. This underscores the distinctiveness of human agency and urges ML researchers to reconceptualise machine autonomy and develop frameworks and methods that can better address this fundamental gap.
Expert Commentary: Understanding Decision-Making in Machine Learning Agents
Machine Learning (ML) agents have become increasingly prevalent in various decision-making tasks and environments. These agents are designed to balance multiple objectives when making choices, but it is crucial to understand how their decision-making processes align with, or differ from, human reasoning.
In the realm of decision-making, humans often encounter what are known as “hard choices” – situations where options are incommensurable: neither option is preferred over the other, yet the agent is not indifferent between them. Humans can identify these hard choices and resolve them through deliberation. However, current ML agents, due to limitations in Multi-Objective Optimization (MOO) methods, struggle to identify, let alone resolve, hard choices.
Both Scalarized Optimization and Pareto Optimization, the two main MOO approaches, fail to capture the concept of incommensurability (a minimal sketch of why appears after the list below). This limitation gives rise to three significant alignment problems:
- The alienness of ML decision-making behavior from a human perspective
- The unreliability of preference-based alignment strategies for hard choices
- The blockage of alignment strategies pursuing multiple objectives
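The following is a minimal illustrative sketch (not from the paper) of why the two standard MOO formulations cannot represent incommensurability: weighted-sum scalarization always induces a total order over options, so a hard choice collapses into either a strict preference or exact indifference, while Pareto dominance only reports “dominated” or “non-dominated” and cannot distinguish incommensurable options from equally good ones.

```python
import numpy as np

# Two objectives (e.g. safety and efficiency) for three options; higher is better.
options = {
    "A": np.array([0.9, 0.2]),
    "B": np.array([0.3, 0.8]),
    "C": np.array([0.2, 0.1]),
}

def scalarize(scores, weights=(0.5, 0.5)):
    # Weighted-sum scalarization forces a total order over the options.
    return {k: float(np.dot(v, weights)) for k, v in scores.items()}

def pareto_dominates(u, v):
    # u dominates v if it is at least as good on every objective and strictly better on one.
    return bool(np.all(u >= v) and np.any(u > v))

print(scalarize(options))
# With equal weights, A and B tie exactly: the scalarized agent treats a
# hard choice as if it were plain indifference.

for name, score in options.items():
    dominated = any(pareto_dominates(other, score)
                    for other_name, other in options.items() if other_name != name)
    print(name, "dominated" if dominated else "non-dominated")
# A and B are both non-dominated, but Pareto optimization cannot say whether
# they are equally good, a matter of indifference, or a genuine hard choice.
```

The point of the sketch is not the specific numbers but the missing output category: neither formulation has a way to return “these options are incommensurable”, which is what the three alignment problems above trace back to.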
To address these alignment problems, the article discusses two potential technical solutions. However, it recommends an ensemble solution as the most promising option for enabling ML agents to identify hard choices and mitigate alignment problems. This ensemble solution combines different MOO methods to capture incommensurability and make decision-making more compatible with human reasoning.
While the ensemble solution shows promise in identifying hard choices, it is important to note that no known technique allows ML agents to autonomously change their goals or resolve hard choices through deliberation. This highlights the uniqueness of human agency and prompts ML researchers to rethink the concept of machine autonomy. It calls for the development of frameworks and methods that can bridge this fundamental gap.
The discussion in this article emphasizes the multidisciplinary nature of the concepts explored. It touches upon aspects of decision theory, optimization algorithms, and the philosophy of agency. Understanding and aligning ML decision-making with human reasoning requires insights from multiple fields, demonstrating the need for collaboration and cross-pollination of ideas.
In the future, further research and innovation in MOO methods, the development of novel frameworks, and an interdisciplinary approach will be crucial for bringing ML decision-making closer to human reasoning. By addressing the limitations discussed in this article, we can unlock the full potential of ML agents in various real-world applications, from healthcare to finance and beyond.
Read the original article
by jsendak | Apr 24, 2025 | Art
Thematic Preface:
The Venice Biennale, established in 1895, holds a prestigious position in the realm of contemporary art, attracting millions of visitors and artists from around the world. Celebrating art, culture, and innovation, this biennial event provides a platform for countries to showcase their artistic prowess and engage in global dialogue. In 2026, the responsibility of representing Japan at the Venice Biennale falls upon the artist Ei Arakawa-Nash, who brings a fresh perspective and bold vision to this esteemed platform.
Art and Japan have long had a deep and intertwined relationship. From ancient masterpieces, such as Hokusai’s “The Great Wave off Kanagawa,” to modern-day influencers like Yayoi Kusama, Japan has consistently produced artists who captivate the world with their distinctive style and cultural significance. Against this backdrop, Arakawa-Nash’s selection carries a weighty symbolism, signaling Japan’s continued artistic evolution and relevance on the global stage.
Arakawa-Nash’s artistic practice is renowned for its multidisciplinarity and its ability to blur the boundaries between conventional mediums. Drawing inspiration from both historical Japanese art and contemporary global trends, Arakawa-Nash’s work embodies a fluidity that mirrors the interconnectedness of our modern world.
This article dives into the fascinating journey that has led Arakawa-Nash to this pivotal moment in their career. We explore their artistic influences, their exploration of cultural identity, and their innovative approaches to art-making. Through this exploration, we gain insights into the broader themes and narratives that Arakawa-Nash may bring to the 2026 Venice Biennale.
As we navigate an increasingly interconnected and complex world, the role of art in fostering dialogue, challenging conventional narratives, and shaping culture is more crucial than ever. Arakawa-Nash’s representation of Japan at the Venice Biennale serves as a powerful reminder of the potential of art to transcend borders and ignite conversations that resonate across diverse cultures and perspectives.
Ei Arakawa-Nash to represent Japan at the 2026 Venice Biennale
Read the original article
by jsendak | Apr 24, 2025 | GR & QC Articles
arXiv:2504.15318v1 Announce Type: new
Abstract: We examine the impact of non-perturbative quantum corrections to the entropy of both charged and charged rotating quasi-topological black holes, with a focus on their thermodynamic properties. The negative-valued correction to the entropy for small black holes is found to be unphysical. Furthermore, we analyze the effect of these non-perturbative corrections on other thermodynamic quantities, including internal energy, Gibbs free energy, charge density, and mass density, for both types of black holes. Our findings indicate that the sign of the correction parameter plays a crucial role at small horizon radii. Additionally, we assess the stability and phase transitions of these black holes in the presence of non-perturbative corrections. Below the critical point, both the corrected and uncorrected specific heat per unit volume are in an unstable regime. This instability leads to a first-order phase transition, wherein the specific heat transitions from negative to positive values as the system reaches a stable state.
Examining Non-Perturbative Quantum Corrections to Black Hole Entropy
We explore the impact of non-perturbative quantum corrections on the entropy of charged and charged rotating quasi-topological black holes. The focus is on understanding the thermodynamic properties of these black holes and the implications of the corrections.
Unphysical Negative-Valued Corrections for Small Black Holes
Our analysis reveals that the non-perturbative correction can drive the entropy of small black holes to negative values. Such negative entropy is unphysical, which calls the validity of the correction into question at small horizon radii.
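As a hedged illustration (a form commonly used in the literature, not necessarily the exact expression adopted in this paper), non-perturbative corrections are often modelled as an exponential term added to the leading Bekenstein-Hawking entropy, which makes clear why small black holes are the problematic regime:

```latex
% Illustrative (assumed) form of a non-perturbative entropy correction:
% S_0 is the leading Bekenstein--Hawking entropy and \eta the correction parameter.
S = S_0 + \eta\, e^{-S_0},
\qquad
S_0 \to 0 \ \text{as}\ r_+ \to 0
\;\Longrightarrow\;
S \approx \eta \ \ \text{for small horizons.}
```

The exponential term is negligible for large horizons but competes with $S_0$ as $r_+ \to 0$, so a negative correction parameter drives the total entropy negative exactly in the small-black-hole regime flagged above as unphysical.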
Effects on Other Thermodynamic Quantities
In addition to entropy, we investigate the effects of non-perturbative corrections on various thermodynamic quantities such as internal energy, Gibbs free energy, charge density, and mass density. These quantities can provide further insights into the behavior of these black holes.
Significance of Correction Parameter at Small Horizon Radii
Our findings highlight the importance of the sign of the correction parameter in determining the thermodynamic behavior of black holes with small horizon radii. This observation suggests that the correction parameter plays a crucial role in the physics at this scale.
Stability and Phase Transitions
We also assess the stability and phase transitions of these black holes in the presence of non-perturbative corrections. Our results show that both the corrected and uncorrected specific heat per unit volume lie in an unstable regime below the critical point. This instability leads to a first-order phase transition in which the specific heat moves from negative to positive values as the system reaches a stable state.
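To make the stability criterion explicit (a standard thermodynamic relation, not something specific to this paper), the sign of the specific heat determines local stability:

```latex
% Standard local-stability criterion: the specific heat is the temperature-weighted
% response of the entropy to temperature changes (charge Q held fixed).
C = T \left( \frac{\partial S}{\partial T} \right)_{Q},
\qquad
\begin{cases}
  C > 0 & \text{locally stable phase,} \\
  C < 0 & \text{locally unstable phase.}
\end{cases}
```

In these terms, the finding is that below the critical point both the corrected and uncorrected specific heats sit on the $C<0$ branch, and the transition carries the system onto the $C>0$ branch as it settles into a stable state.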
Roadmap to the Future
While this study provides valuable insights into the effects of non-perturbative quantum corrections on the thermodynamic properties of black holes, there are several challenges and opportunities to be addressed in future research.
Challenges
- Resolving the unphysical negative entropy values predicted for small black holes
- Understanding the underlying reasons for the instability of the specific heat per unit volume below the critical point
- Further investigation into the role of the correction parameter at small horizon radii
Opportunities
- Exploring alternative approaches to account for non-perturbative quantum corrections
- Investigating the implications of these corrections on other black hole properties beyond thermodynamics
- Examining the connection between non-perturbative corrections and quantum gravitational effects
Overall, the study of non-perturbative quantum corrections to black hole thermodynamics opens up new avenues for understanding the fundamental nature of black holes and the interplay between quantum mechanics and gravity. Further research in this area will contribute to a deeper understanding of black hole physics and its theoretical implications.
Read the original article
by jsendak | Apr 24, 2025 | Computer Science
Expert Commentary: Improving Code Editing with EditLord
In software development, code editing is a foundational task that plays a crucial role in ensuring the effectiveness and functionality of the software. The article introduces EditLord, a code editing framework that aims to enhance the performance, robustness, and generalization of code editing procedures.
A key insight presented in EditLord is the use of a language model (LM) as an inductive learner to extract code editing rules from training code pairs. This approach allows for the formulation of concise meta-rule sets that can be utilized for various code editing tasks.
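The following is a minimal sketch of what rule extraction from before/after code pairs might look like; the prompt wording and the `call_llm` helper are hypothetical illustrations, not EditLord's actual implementation.

```python
# Hypothetical helper: call_llm(prompt) is assumed to wrap whatever LM API is in use.
def extract_edit_rule(code_before: str, code_after: str, call_llm) -> str:
    """Ask an LM to describe, as a reusable rule, the transformation between two code versions."""
    prompt = (
        "You are given a code edit. State the general, reusable editing rule it follows "
        "in one imperative sentence.\n\n"
        f"BEFORE:\n{code_before}\n\nAFTER:\n{code_after}\n\nRULE:"
    )
    return call_llm(prompt).strip()

# Example training pair; the extracted rule might read something like
# "Replace string concatenation in SQL queries with parameterised queries."
before = 'cursor.execute("SELECT * FROM users WHERE id = " + user_id)'
after  = 'cursor.execute("SELECT * FROM users WHERE id = %s", (user_id,))'
```

Aggregating and deduplicating such per-pair rules is one plausible way to arrive at the concise meta-rule sets the article mentions.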
One notable advantage of explicitly defining the code transformation steps is that it addresses the limitations of existing approaches that treat code editing as an implicit end-to-end task. By breaking down the editing process into discrete and explicit steps, EditLord overcomes the challenges related to suboptimal performance and lack of robustness and generalization.
The use of LMs in EditLord offers several benefits. First, training samples can be augmented by manifesting the rule set relevant to each sample, which strengthens fine-tuning and also supports prompting-based and iterative code editing. Second, by leveraging LMs in this way, EditLord achieves improved editing performance and robustness compared to existing state-of-the-art methods.
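As a hedged sketch of how such rule sets could augment prompting-based editing (again using the hypothetical `call_llm` wrapper rather than EditLord's own interface):

```python
def edit_with_rules(code: str, task: str, rules: list[str], call_llm) -> str:
    """Prompt an LM to perform a code edit while explicitly following extracted meta-rules."""
    rule_block = "\n".join(f"- {rule}" for rule in rules)
    prompt = (
        "Apply the following editing task to the code, step by step, "
        f"obeying these rules:\n{rule_block}\n\n"
        f"TASK: {task}\n\nCODE:\n{code}\n\nEDITED CODE:"
    )
    return call_llm(prompt)

# The same manifested rules can also be attached to training pairs as extra
# context, augmenting the data used for fine-tuning.
```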
Furthermore, EditLord demonstrates its effectiveness across critical software engineering and security applications, different LMs, and editing modes. The framework achieves an average improvement of 22.7% in editing performance and 58.1% in robustness, and it ensures 20.2% higher functional correctness, which is crucial for developing reliable and secure software.
The advancements brought by EditLord have significant implications for the field of code editing and software development as a whole. By explicitly defining code transformation steps and utilizing LM models, developers can benefit from enhanced performance, robustness, generalization, and functional correctness. This can lead to more efficient and reliable software development processes, ultimately resulting in higher-quality software products.
Future Outlook
Looking ahead, the concepts and techniques introduced by EditLord open doors for further research and development in code editing. One possible direction is the exploration of different types of language models and their impact on code editing performance. Additionally, investigating the integration of other machine learning techniques and algorithms with EditLord could yield even more significant improvements.
Moreover, the application of EditLord to specific domains, such as machine learning or cybersecurity, may uncover domain-specific code editing rules and optimizations. This domain-specific approach could further enhance the performance and accuracy of code editing in specialized software development areas.
Overall, EditLord presents a promising framework for code editing, offering a more explicit and robust approach to code transformation. Its adoption has the potential to revolutionize the software development process, leading to higher efficiency, reliability, and security in software creation.
Read the original article