by jsendak | Jan 1, 2024 | Computer Science
Expert Commentary: Combinatorial Semantics and the Generation of Argument Patterns in Nominal Phrases
Combinatorial Semantics has emerged as a valuable field in natural language processing, its contributions demonstrated here through the design of prototypes for the automatic generation of argument patterns in nominal phrases in Spanish, French, and German. The paper sheds light on how important it is to understand the syntactic-semantic interface of arguments when producing text in a foreign language.
The authors begin by providing a comprehensive overview of the design, typology, and information levels of the resources employed in the development of the prototypes (Xera, Combinatoria, and CombiContext). This background information sets the stage for understanding the subsequent discussion on the central role that combinatorial meaning plays in the generation process.
The study emphasizes the importance of semantic filters in the selection, organization, and expansion of the lexicon. These filters serve as crucial components in generating grammatically correct and semantically acceptable mono- and biargumental nominal phrases. By applying these filters, the prototypes ensure that the generated phrases meet the requirements of both syntax and meaning.
One of the key insights offered by this research is the exploration of argument patterns from a syntactic-semantic perspective. By analyzing argument roles and ontological features, the prototypes are able to generate meaningful and coherent phrases. This approach not only considers the syntactic structure of arguments but also takes into account their semantic relationships, resulting in more accurate and contextually appropriate outputs.
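To make the role of these semantic filters concrete, the toy Python sketch below filters candidate argument fillers for a predicate noun by ontological features and assembles mono- and bi-argumental nominal phrases. It is not the actual Xera, Combinatoria, or CombiContext resources; the lexicon, ontological labels, and prepositional markers are invented for illustration, and articles and agreement are omitted.

```python
# Illustrative sketch only: a predicate noun constrains each argument slot by
# semantic role and ontological features, and the generator keeps only
# lexicon entries that pass those semantic filters.

LEXICON = {
    "investigador": {"onto": {"human"}},
    "laboratorio": {"onto": {"institution", "location"}},
    "datos": {"onto": {"information"}},
    "mesa": {"onto": {"artifact"}},
}

# Hypothetical entry for the predicate noun "análisis" with two argument slots.
PREDICATE = {
    "lemma": "análisis",
    "slots": [
        {"role": "agent", "onto": {"human", "institution"}, "marker": "por parte de"},
        {"role": "theme", "onto": {"information"}, "marker": "de"},
    ],
}

def passes_filter(entry, slot):
    """Semantic filter: the candidate must share an ontological feature with the slot."""
    return bool(LEXICON[entry]["onto"] & slot["onto"])

def generate_phrases(predicate, lexicon):
    """Yield mono- and bi-argumental nominal phrases that satisfy all filters."""
    agent_slot, theme_slot = predicate["slots"]
    agents = [w for w in lexicon if passes_filter(w, agent_slot)]
    themes = [w for w in lexicon if passes_filter(w, theme_slot)]
    for theme in themes:  # mono-argumental, e.g. "análisis de datos"
        yield f"{predicate['lemma']} {theme_slot['marker']} {theme}"
    for agent in agents:  # bi-argumental: theme plus agent
        for theme in themes:
            yield (f"{predicate['lemma']} {theme_slot['marker']} {theme} "
                   f"{agent_slot['marker']} {agent}")

for phrase in generate_phrases(PREDICATE, LEXICON):
    print(phrase)
```

Here "mesa" (an artifact) is filtered out of both slots, so only semantically acceptable combinations are ever generated.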
Going forward, it would be interesting to see how these prototypes can be further refined and extended to cover a wider range of languages. Additionally, future research could delve into the application of combinatorial semantics in other domains and explore its potential in more complex sentence constructions.
Read the original article
by jsendak | Dec 31, 2023 | Computer Science
Many XR applications require the delivery of volumetric video to users with six degrees of freedom (6-DoF) movements. Point Cloud has become a popular volumetric video format. A dense point cloud consumes much higher bandwidth than a 2D/360-degree video frame. User Field of View (FoV) is more dynamic with 6-DoF movement than with 3-DoF movement. To save bandwidth, FoV-adaptive streaming predicts a user's FoV and only downloads point cloud data falling in the predicted FoV. However, it is vulnerable to FoV prediction errors, which can be significant when a long buffer is utilized for smoothed streaming. In this work, we propose a multi-round progressive refinement framework for point cloud video streaming. Instead of sequentially downloading point cloud frames, our solution simultaneously downloads/patches multiple frames falling into a sliding time-window, leveraging the inherent scalability of octree-based point-cloud coding. The optimal rate allocation among all tiles of active frames is solved analytically using the heterogeneous tile rate-quality functions calibrated by the predicted user FoV. Multi-frame downloading/patching simultaneously takes advantage of the streaming smoothness resulting from a long buffer and the FoV prediction accuracy at short buffer length. We evaluate our streaming solution using simulations driven by real point cloud videos, real bandwidth traces, and 6-DoF FoV traces of real users. Our solution is robust against bandwidth/FoV prediction errors, and can deliver high and smooth view quality in the face of bandwidth variations and dynamic user and point cloud movements.
Expert Commentary: The Multi-Disciplinary Nature of Point Cloud Video Streaming
Point cloud video streaming is an important aspect of multimedia information systems, as it enables the delivery of volumetric video with six degrees of freedom (6-DoF) movements to users. It is a multi-disciplinary area that combines concepts from animations, artificial reality, augmented reality, and virtual realities.
The article discusses the challenges of delivering point cloud videos, which consume higher bandwidth compared to traditional 2D or 360-degree videos. Additionally, the user’s field of view (FoV) is more dynamic with 6-DoF movement, making it necessary to optimize the streaming process to save bandwidth and provide a high-quality viewing experience.
To address these challenges, the authors introduce a multi-round progressive refinement framework for point cloud video streaming. This framework simultaneously downloads and patches multiple frames falling into a sliding time-window, leveraging the scalability of octree-based point-cloud coding. By optimally allocating rate among all tiles of the active frames, the solution delivers high view quality based on the predicted user FoV.
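As a rough illustration of FoV-driven rate allocation across tiles, the sketch below assumes a concave logarithmic rate-quality curve per tile, weighted by its predicted probability of falling in the FoV, and greedily spends a bandwidth budget on the tile with the largest marginal quality gain. The paper instead solves the allocation analytically with calibrated, heterogeneous tile curves, so this is only a stand-in for the idea.

```python
import numpy as np

def allocate_rates(fov_weights, total_budget, step=0.05):
    """Greedy marginal-utility rate allocation across tiles.

    Assumes a concave rate-quality curve q_i(r) = w_i * log(1 + r) per tile,
    where w_i is the predicted probability that tile i falls in the user's FoV.
    """
    rates = np.zeros_like(fov_weights, dtype=float)
    budget = total_budget
    while budget >= step:
        # Marginal quality gain of adding one step of rate to each tile.
        gains = fov_weights * (np.log1p(rates + step) - np.log1p(rates))
        best = int(np.argmax(gains))
        if gains[best] <= 0:
            break
        rates[best] += step
        budget -= step
    return rates

# Example: 6 tiles, with the FoV predictor assigning higher weight to tiles 0-2.
weights = np.array([0.9, 0.8, 0.7, 0.2, 0.1, 0.05])
print(allocate_rates(weights, total_budget=10.0))
```

Tiles likely to be visible receive most of the budget, while occluded or out-of-view tiles get little or nothing, which is the essence of FoV-adaptive streaming.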
The multi-disciplinary nature of this framework becomes evident when considering its various components. The use of point cloud videos brings in concepts from animations and 3D modeling, as it requires the representation of objects as a collection of points in 3D space. The integration of artificial reality, augmented reality, and virtual realities is crucial in understanding the user’s dynamic field of view and predicting their FoV accurately for optimized streaming.
From a multimedia information systems perspective, this framework addresses the challenge of delivering volumetric video effectively. Bandwidth efficiency is essential in multimedia systems, especially when dealing with resource-intensive formats like point clouds. By optimizing the rate allocation and leveraging the scalability of octree-based coding, the proposed solution tackles the bandwidth consumption issue and ensures a high-quality viewing experience.
The evaluation of the streaming solution using simulations driven by real point cloud videos, bandwidth traces, and 6-DoF FoV traces of real users demonstrates its robustness against bandwidth and FoV prediction errors. This is significant in the context of multimedia information systems, as it validates the effectiveness of the framework in delivering high and smooth view quality despite variations in bandwidth and dynamic user and point cloud movements.
In conclusion, point cloud video streaming is an area that intersects various disciplines within the field of multimedia information systems. The proposed multi-round progressive refinement framework addresses the challenges of delivering volumetric video with 6-DoF movements by optimizing rate allocation and leveraging octree-based coding. This solution demonstrates the multi-disciplinary nature of point cloud video streaming and its relevance to animations, artificial reality, augmented reality, and virtual realities.
Read the original article
by jsendak | Dec 31, 2023 | Computer Science
Expert Commentary: Overcoming Assumptions in Synthetic Control Methods Using Incentivized Exploration
The use of synthetic control methods (SCMs) has become increasingly prevalent in panel data settings. These methods aim to estimate counterfactual outcomes for test units by leveraging data from donor units that have remained under control. However, a critical assumption in the literature on SCMs is that there is sufficient overlap between the outcomes of the donor units and the test unit in order for accurate counterfactual estimates to be produced.
This assumption, while common, may not always hold in practice. In scenarios where units have agency over their own interventions and different subpopulations have distinct preferences, the outcomes for test units may not lie within the convex hull or linear span of the outcomes for the donor units. This limitation can significantly impact the accuracy and reliability of SCM-based analyses.
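To see why that overlap matters, the following minimal sketch fits standard synthetic-control weights by simplex-constrained least squares on pre-intervention outcomes, using toy synthetic data. When the test unit's trajectory lies far outside the convex hull of the donor trajectories, no such weights fit well and the counterfactual becomes unreliable. This is the classical setup, not the incentivized-exploration algorithm proposed in the paper.

```python
import numpy as np
from scipy.optimize import minimize

def scm_weights(Y_donors_pre, y_test_pre):
    """Fit synthetic-control weights on the simplex (w >= 0, sum w = 1)
    by minimizing pre-intervention fit error."""
    n_donors = Y_donors_pre.shape[0]
    loss = lambda w: np.sum((y_test_pre - w @ Y_donors_pre) ** 2)
    cons = [{"type": "eq", "fun": lambda w: np.sum(w) - 1.0}]
    bounds = [(0.0, 1.0)] * n_donors
    w0 = np.full(n_donors, 1.0 / n_donors)
    res = minimize(loss, w0, bounds=bounds, constraints=cons)
    return res.x

# Toy example: 3 donor units, 5 pre-intervention periods (synthetic data).
rng = np.random.default_rng(0)
Y_donors_pre = rng.normal(size=(3, 5))
y_test_pre = 0.5 * Y_donors_pre[0] + 0.5 * Y_donors_pre[1]  # inside the convex hull
w = scm_weights(Y_donors_pre, y_test_pre)
print("donor weights:", w.round(3))
# Post-intervention counterfactual would then be estimated as w @ Y_donors_post.
```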
Fortunately, a recent study addresses this issue by proposing a novel approach that incentivizes units with different preferences to take interventions they would not typically consider. This method, referred to as incentivized exploration in panel data settings, combines principles from information design and online learning to provide incentive-compatible intervention recommendations to units.
By leveraging this algorithm, researchers can obtain valid counterfactual estimates using SCMs without relying on an explicit overlap assumption on unit outcomes. The proposed approach encourages units to explore interventions beyond their default preferences, ensuring a more comprehensive understanding of the underlying causal effects. This incentivized exploration not only reduces potential biases caused by selection effects but also enhances the generalizability of SCM-based studies.
The implications of this research are substantial. It offers a new perspective on addressing the limitations of SCMs in situations where overlap assumptions do not hold. By expanding the range of interventions considered by units, researchers can gain insights into the causal effects of different policy choices or interventions across a broader spectrum of scenarios.
Moreover, this novel approach opens avenues for future research. As we continue to refine and enhance the incentivized exploration algorithm, it would be valuable to explore its applicability in diverse domains, such as healthcare, economics, and public policy. Additionally, further investigation into the potential trade-offs and constraints associated with incentivizing exploration would provide a more nuanced understanding of the approach’s effectiveness.
In conclusion, this study highlights the importance of addressing assumptions in SCMs and offers a promising solution through incentivized exploration. By incentivizing units with different preferences to explore alternative interventions, researchers can overcome limitations imposed by traditional overlap assumptions. The proposed algorithm provides a valuable tool for obtaining accurate counterfactual estimates in panel data settings and opens doors for future advancements and applications in diverse fields.
Read the original article
by jsendak | Dec 31, 2023 | Computer Science
With the explosive increase of User Generated Content (UGC), UGC video quality assessment (VQA) becomes more and more important for improving users' Quality of Experience (QoE). However, most existing UGC VQA studies only focus on the visual distortions of videos, ignoring that the user's QoE also depends on the accompanying audio signals. In this paper, we conduct the first study to address the problem of UGC audio and video quality assessment (AVQA). Specifically, we construct the first UGC AVQA database, named the SJTU-UAV database, which includes 520 in-the-wild UGC audio and video (A/V) sequences, and conduct a user study to obtain the mean opinion scores of the A/V sequences. The content of the SJTU-UAV database is then analyzed from both the audio and video aspects to show the database characteristics. We also design a family of AVQA models, which fuse the popular VQA methods and audio features via a support vector regressor (SVR). We validate the effectiveness of the proposed models on three databases. The experimental results show that with the help of audio signals, the VQA models can evaluate the perceptual quality more accurately. The database will be released to facilitate further research.
UGC Audio and Video Quality Assessment: A Multi-disciplinary Approach
With the proliferation of User Generated Content (UGC) videos, ensuring high-quality content has become crucial for enhancing users’ Quality of Experience (QoE). While most studies have focused solely on visual distortions in UGC videos, this article presents the first study on audio and video quality assessment (AVQA) for UGC.
The authors of this paper have constructed the SJTU-UAV database, which consists of 520 UGC audio and video sequences captured in real-world settings. They conducted a user study to obtain mean opinion scores for these sequences, allowing for a comprehensive analysis of the database from both audio and video perspectives.
This research is significant in highlighting the multi-disciplinary nature of multimedia information systems by incorporating both visual and audio elements. Traditionally, multimedia systems have focused primarily on visual content, but this study recognizes that a user's QoE depends not only on what they see but also on what they hear.
The article introduces a family of AVQA models that integrate popular Video Quality Assessment (VQA) methods with audio features using support vector regression (SVR). By leveraging audio signals, these models aim to evaluate the perceptual quality more accurately than traditional VQA models.
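A minimal sketch of this fusion idea is shown below, assuming pre-computed video-quality features and audio features (both are random placeholders here; the actual features and model details in the paper may differ) that are concatenated and regressed against mean opinion scores with scikit-learn's SVR.

```python
import numpy as np
from sklearn.svm import SVR
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline

# Hypothetical pre-computed features: one row per A/V sequence.
# video_feats could come from an existing VQA model's outputs;
# audio_feats could be, e.g., MFCC statistics. Both are placeholders here.
rng = np.random.default_rng(42)
n_sequences = 200
video_feats = rng.normal(size=(n_sequences, 16))
audio_feats = rng.normal(size=(n_sequences, 8))
mos = rng.uniform(1.0, 5.0, size=n_sequences)   # mean opinion scores

# Fuse by simple concatenation, then regress MOS with an SVR.
X = np.hstack([video_feats, audio_feats])
model = make_pipeline(StandardScaler(), SVR(kernel="rbf", C=10.0))
model.fit(X[:150], mos[:150])
pred = model.predict(X[150:])
print("predicted MOS for held-out sequences:", pred[:5].round(2))
```

The design choice worth noting is that the SVR acts purely as a late-fusion regressor, so any existing VQA backbone can contribute its features without retraining.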
The field of multimedia information systems encompasses various technologies, including animations, artificial reality, augmented reality, and virtual realities. This study demonstrates how AVQA plays a vital role in enhancing the user’s experience in these domains. As media technologies continue to evolve, incorporating high-quality audio alongside visual elements becomes essential for providing immersive experiences.
The experimental results presented in the paper validate the effectiveness of the proposed AVQA models, showcasing that integrating audio signals improves the accuracy of perceptual quality assessment. This research opens up possibilities for further exploration and development in the field of UGC AVQA.
In conclusion, this study on UGC audio and video quality assessment highlights the importance of considering both visual and audio elements in multimedia systems. By addressing the limitations of existing studies that solely focus on visual distortions, the authors pave the way for more accurate evaluation of the perceptual quality of UGC content. This research contributes to the wider field of multimedia information systems, where the integration of audio and visual elements holds significant potential for enhancing user experiences in animations, artificial reality, augmented reality, and virtual realities.
Read the original article
by jsendak | Dec 31, 2023 | Computer Science
Analysis: Unifying Static and Dynamic Control in ADL Compilation with Piezo
In the field of accelerator design, compilers for accelerator design languages (ADLs) play a crucial role in translating high-level languages into application-specific hardware. These compilers rely on a hardware control interface to compose different hardware units together. Traditionally, there have been two options for this control mechanism: static control and dynamic control.
Static Control
Static control relies on cycle-level timing to coordinate the execution of different hardware units. It is efficient as it eliminates the need for explicit signaling, but it can be brittle and prone to timing issues. Any variation in the timing can lead to failures or incorrect behavior in the hardware design.
Dynamic Control
On the other hand, dynamic control avoids depending on timing details and instead uses explicit signaling to coordinate the behavior of hardware units. This approach is more flexible and less prone to timing-related issues. However, dynamic control introduces additional hardware costs to support compositional reasoning. The explicit signaling mechanisms require extra resources and overhead.
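The contrast can be illustrated with a toy Python simulation (this is not Calyx or Piezo code; the units and latencies are invented): static composition starts each successor after a compile-time cycle count and breaks silently when a unit's real latency differs from the assumed one, while dynamic composition waits for an explicit done signal.

```python
# Toy simulation contrasting the two control styles (not Calyx/Piezo code).

class Unit:
    """A unit with a latency assumed by the compiler and its real latency."""
    def __init__(self, name, assumed_latency, actual_latency):
        self.name = name
        self.assumed = assumed_latency
        self.actual = actual_latency

def run_static(units):
    """Static control: successors start after a fixed, compile-time cycle count.
    If the real latency exceeds the assumed one, the next unit starts before
    its input is ready -- the brittleness described above."""
    start = 0
    for u in units:
        ready = start + u.actual
        print(f"{u.name}: starts at cycle {start}, data ready at cycle {ready}")
        start += u.assumed                      # schedule fixed at compile time
        if u.actual > u.assumed:
            print(f"  !! successor starts at {start}, before {u.name} finished")

def run_dynamic(units):
    """Dynamic control: each unit raises a 'done' signal and the successor
    waits for it, tolerating variable latency at extra hardware cost."""
    start = 0
    for u in units:
        done = start + u.actual                 # stand-in for the done handshake
        print(f"{u.name}: starts at cycle {start}, done at cycle {done}")
        start = done

pipeline = [Unit("load", 2, 2), Unit("divide", 4, 6), Unit("store", 1, 1)]
run_static(pipeline)
run_dynamic(pipeline)
```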
Piezo: Unifying Static and Dynamic Control
Piezo is an ADL compiler that aims to bridge the gap between static and dynamic control within a single intermediate language (IL). Its key insight is that the static fragment of the IL is a refinement of its dynamic fragment: the run-time behaviors of code written in the static control style are a subset of the run-time behaviors of its equivalent dynamic code.
This insight allows Piezo to optimize code by combining facts from both static and dynamic submodules. It can leverage information from the static code to make more informed decisions during compilation. Additionally, Piezo opportunistically converts code from dynamic to static control styles where it makes sense. This conversion further enhances the efficiency of the compiled hardware design.
Piezo Implementation
To demonstrate the capabilities of Piezo, the researchers have implemented it as an extension to an existing dynamic ADL compiler named Calyx. They have also developed an MLIR frontend, a systolic array generator, and a packet-scheduling hardware generator using Piezo. These implementations showcase the optimization techniques and highlight the static-dynamic interactions enabled by Piezo.
By unifying static and dynamic control in ADL compilation, Piezo offers a promising approach to improving the efficiency and flexibility of hardware designs. It allows developers to leverage the benefits of both control mechanisms while mitigating their respective drawbacks. The ability to optimize code based on combined static and dynamic analysis opens up new possibilities for achieving high-performance hardware designs.
Read the original article
by jsendak | Dec 31, 2023 | Computer Science
Blind Image Quality Assessment (BIQA) is essential for automatically evaluating the perceptual quality of visual signals without access to the references. In this survey, we provide a comprehensive analysis and discussion of recent developments in the field of BIQA. We have covered various aspects, including hand-crafted BIQAs that focus on distortion-specific and general-purpose methods, as well as deep-learned BIQAs that employ supervised and unsupervised learning techniques. Additionally, we have explored multimodal quality assessment methods that consider interactions between visual and audio modalities, as well as visual and text modalities. Finally, we have offered insights into representative BIQA databases, including both synthetic and authentic distortions. We believe this survey provides a valuable understanding of the latest developments and emerging trends for the visual quality community.
Blind Image Quality Assessment: A Comprehensive Analysis and Discussion
Blind Image Quality Assessment (BIQA) plays a crucial role in evaluating the perceptual quality of visual signals without the need for reference images. In this survey, we delve into the various developments in the field of BIQA, providing a thorough analysis and discussion.
Hand-crafted BIQAs: Distortion-specific and General-purpose Methods
One of the key aspects of BIQA is the utilization of hand-crafted methods. These approaches are designed to specifically target certain types of distortions, such as noise, blur, or compression artifacts. By focusing on distortion-specific methods, researchers can develop algorithms with a deep understanding of the underlying artifacts and their impact on image quality.
On the other hand, general-purpose BIQAs aim to assess overall image quality by considering a range of potential distortions. These methods take into account a variety of visual features and statistical measures to estimate the quality of an image. By adopting a more holistic approach, general-purpose methods can provide a comprehensive evaluation of visual signals.
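As one concrete example of a general-purpose hand-crafted approach, the sketch below computes BRISQUE-style mean-subtracted contrast-normalized (MSCN) coefficients and two simple statistics of them; distortions such as blur shift these statistics away from their values on pristine natural images. The feature set here is deliberately tiny and illustrative rather than a faithful reproduction of any specific method in the survey.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def mscn_coefficients(gray_image, sigma=7/6, eps=1.0):
    """Mean-subtracted contrast-normalized (MSCN) coefficients, the basis of
    BRISQUE-style natural scene statistics features for no-reference IQA."""
    img = gray_image.astype(np.float64)
    mu = gaussian_filter(img, sigma)                                   # local mean
    sigma_map = np.sqrt(np.abs(gaussian_filter(img * img, sigma) - mu * mu))
    return (img - mu) / (sigma_map + eps)

def nss_features(gray_image):
    """A tiny feature vector: variance and kurtosis of the MSCN coefficients."""
    mscn = mscn_coefficients(gray_image)
    m, var = mscn.mean(), mscn.var()
    kurt = ((mscn - m) ** 4).mean() / (var ** 2 + 1e-12)
    return np.array([var, kurt])

# Example on a synthetic image and a blurred version of it.
rng = np.random.default_rng(0)
image = rng.uniform(0, 255, size=(128, 128))
blurred = gaussian_filter(image, 3)
print("sharp  :", nss_features(image).round(3))
print("blurred:", nss_features(blurred).round(3))
```

A regressor trained on such statistics, or on richer variants of them, maps the feature vector to a quality score without ever seeing the reference image.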
Deep-learned BIQAs: Supervised and Unsupervised Learning Techniques
In recent years, deep learning has emerged as a powerful tool in various fields, including BIQA. Deep-learned BIQAs leverage the capabilities of neural networks to learn complex relationships between image features and perceived quality. These approaches can be categorized into supervised and unsupervised learning techniques.
Supervised learning techniques train the neural networks using large datasets that have been annotated with subjective quality scores. This enables the network to learn from human judgments and make accurate predictions about the quality of unseen images. Unsupervised learning techniques, on the other hand, aim to discover inherent structures and patterns within the data without relying on explicit quality labels.
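A minimal supervised example is sketched below: a deliberately small PyTorch CNN regresses a scalar quality score against (here, synthetic) mean opinion scores with an L1 loss. Real supervised BIQA networks are far larger and are trained on annotated databases; only the overall setup is meant to carry over.

```python
import torch
import torch.nn as nn

class TinyBIQA(nn.Module):
    """A deliberately small CNN that regresses a quality score from an image."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.head = nn.Linear(32, 1)

    def forward(self, x):
        return self.head(self.features(x).flatten(1)).squeeze(1)

# Synthetic stand-in data: 32 images with fake MOS labels in [1, 5].
images = torch.rand(32, 3, 64, 64)
mos = torch.rand(32) * 4 + 1

model = TinyBIQA()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.L1Loss()

for epoch in range(5):
    optimizer.zero_grad()
    loss = loss_fn(model(images), mos)
    loss.backward()
    optimizer.step()
    print(f"epoch {epoch}: L1 loss {loss.item():.3f}")
```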
Multimodal Quality Assessment: Visual, Audio, and Text Modalities
One of the intriguing aspects of BIQA is the exploration of multimodal quality assessment methods. These approaches consider the interactions between different modalities, such as visual, audio, and text, to determine overall quality. By incorporating multiple modalities, researchers can capture a more comprehensive understanding of visual signals and their perceived quality.
Representative BIQA Databases: Synthetic and Authentic Distortions
The availability of high-quality databases is crucial for the advancement of BIQA research. This survey highlights the importance of representative BIQA databases that encompass both synthetic and authentic distortions. Synthetic distortions allow researchers to create controlled environments for testing and evaluating algorithms, while authentic distortions reflect real-world scenarios and challenges.
The Wider Field: Multimedia Information Systems, Animations, Artificial Reality, Augmented Reality, and Virtual Realities
The concepts and developments discussed in this survey have strong connections to the wider field of multimedia information systems. Multimedia information systems deal with the storage, retrieval, and analysis of multimedia data, which includes images, videos, animations, and more.
Moreover, the advancements in BIQA impact various applications that fall under the umbrella of artificial reality. Animations, artificial reality, augmented reality, and virtual realities heavily rely on high-quality visual signals to create immersive and realistic experiences. The ability to automatically assess the quality of these visual signals contributes to enhancing the overall user experience.
In conclusion, this survey provides valuable insights into the latest developments and emerging trends in Blind Image Quality Assessment. By covering various approaches, modalities, and databases, it offers a comprehensive understanding of this multi-disciplinary field. As multimedia information systems continue to evolve and intersect with the realms of artificial and virtual reality, BIQA remains an integral component in ensuring high-quality visual content.
Read the original article