Week Ending 2.18.2024

 

RESEARCH WATCH: 2.18.2024

 

3D Diffuser Actor: Policy Diffusion with 3D Scene Representations

The paper introduces a new method to improve robot manipulation using conditional diffusion models and 3D scene representations. This could allow robots to better follow natural language instructions in real-world settings.

Authors:  Tsung-Wei Ke, Nikolaos Gkanatsios, Katerina Fragkiadaki

Link:  https://arxiv.org/abs/2402.10885v1

Date: 2024-02-16

Summary:

We marry diffusion policies and 3D scene representations for robot manipulation. Diffusion policies learn the action distribution conditioned on the robot and environment state using conditional diffusion models. They have recently been shown to outperform both deterministic and alternative state-conditioned action distribution learning methods. 3D robot policies use 3D scene feature representations aggregated from a single or multiple camera views using sensed depth. They have been shown to generalize better than their 2D counterparts across camera viewpoints. We unify these two lines of work and present 3D Diffuser Actor, a neural policy architecture that, given a language instruction, builds a 3D representation of the visual scene and conditions on it to iteratively denoise 3D rotations and translations for the robot's end-effector. At each denoising iteration, our model represents end-effector pose estimates as 3D scene tokens and predicts the 3D translation and rotation error for each of them, by featurizing them using 3D relative attention to other 3D visual and language tokens. 3D Diffuser Actor sets a new state-of-the-art on RLBench with an absolute performance gain of 16.3% over the current SOTA on a multi-view setup and an absolute gain of 13.1% on a single-view setup. On the CALVIN benchmark, it outperforms the current SOTA in the setting of zero-shot unseen scene generalization by being able to successfully run 0.2 more tasks, a 7% relative increase. It also works in the real world from a handful of demonstrations. We ablate our model's architectural design choices, such as 3D scene featurization and 3D relative attentions, and show they all help generalization. Our results suggest that 3D scene representations and powerful generative modeling are keys to efficient robot learning from demonstrations.
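
To make the iterative denoising idea concrete, here is a minimal sketch of a diffusion-style policy loop: a pose sampled from noise is repeatedly refined by a predicted error, conditioned on scene and language features. The predictor below is a trivial stand-in for the paper's 3D relative-attention network; all names and values are illustrative, not the authors' implementation.

    # Minimal sketch of a diffusion-style policy: starting from random noise,
    # iteratively refine an end-effector pose (3D translation plus rotation,
    # Euler angles here for brevity) conditioned on scene/language features.
    import numpy as np

    def predict_error(noisy_pose, scene_feat, lang_feat, t):
        """Placeholder denoiser: in the real method, a trained network featurizes
        the pose as a 3D scene token and attends to visual/language tokens."""
        target = np.array([0.4, 0.1, 0.3, 0.0, np.pi / 2, 0.0])  # toy "demonstrated" pose
        return noisy_pose - target  # predicted error toward the target

    def denoise_pose(scene_feat, lang_feat, steps=50, step_size=0.1, seed=0):
        rng = np.random.default_rng(seed)
        pose = rng.normal(size=6)             # start from pure noise
        for t in reversed(range(steps)):
            err = predict_error(pose, scene_feat, lang_feat, t)
            pose = pose - step_size * err     # move against the predicted error
        return pose

    print(denoise_pose(scene_feat=None, lang_feat=None))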

--------------------------------------------------------------------------------------------------------

Multi-modal preference alignment remedies regression of visual instruction tuning on language model

The paper examines how to tune large language models for visual instruction without losing language capabilities, using preference learning on a small specialized dataset. This is important for deploying multi-modal AI systems that can handle both text and images.

Authors:  Shengzhi Li, Rongyu Lin, Shichao Pei

Link:  https://arxiv.org/abs/2402.10884v1

Date: 2024-02-16

Summary:

In production, multi-modal large language models (MLLMs) are expected to support multi-turn queries of interchanging image and text modalities. However, the current MLLMs trained with visual-question-answering (VQA) datasets could suffer from degradation, as VQA datasets lack the diversity and complexity of the original text instruction datasets which the underlying language model had been trained with. To address this challenging degradation, we first collect a lightweight (6k entries) VQA preference dataset where answers were annotated by Gemini for 5 quality metrics in a granular fashion, and investigate standard Supervised Fine-tuning, rejection sampling, Direct Preference Optimization (DPO), and SteerLM. Our findings indicate that with DPO we are able to surpass the instruction-following capabilities of the language model, achieving a 6.73 score on MT-Bench, compared to Vicuna's 6.57 and LLaVA's 5.99, despite the small data scale. This enhancement in textual instruction proficiency correlates with boosted visual instruction performance (+4.9% on MM-Vet, +6% on LLaVA-Bench), with minimal alignment tax on visual knowledge benchmarks compared to the previous RLHF approach. In conclusion, we propose a distillation-based multi-modal alignment model with fine-grained annotations on a small dataset that reconciles the textual and visual performance of MLLMs, restoring and boosting language capability after visual instruction tuning.
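
For readers unfamiliar with DPO, the loss on a single preference pair is compact. The sketch below assumes summed token log-probabilities from the tuned policy and from a frozen reference model are already available; the variable names and example numbers are illustrative and not taken from the paper.

    import math

    def dpo_loss(logp_chosen, logp_rejected,
                 ref_logp_chosen, ref_logp_rejected, beta=0.1):
        """Direct Preference Optimization loss for one preference pair.
        logp_* are summed token log-probabilities under the policy being tuned;
        ref_logp_* come from a frozen reference model."""
        margin = beta * ((logp_chosen - ref_logp_chosen)
                         - (logp_rejected - ref_logp_rejected))
        # -log(sigmoid(margin)): small when the policy prefers the chosen answer
        return math.log(1.0 + math.exp(-margin))

    # Example: the policy already favors the chosen response slightly.
    print(dpo_loss(-42.0, -55.0, -44.0, -53.0))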

--------------------------------------------------------------------------------------------------------

In Search of Needles in a 10M Haystack: Recurrent Memory Finds What LLMs Miss

The paper benchmarks different AI techniques on processing extremely long documents. A fine-tuned GPT-2 augmented with recurrent memory is shown to handle sequences of up to 10 million tokens, far beyond what common approaches manage, potentially enabling new applications in legal tech and research.

Authors:  Yuri Kuratov, Aydar Bulatov, Petr Anokhin, Dmitry Sorokin, Artyom Sorokin, Mikhail Burtsev

Link:  https://arxiv.org/abs/2402.10790v1

Date: 2024-02-16

Summary:

This paper addresses the challenge of processing long documents using generative transformer models. To evaluate different approaches, we introduce BABILong, a new benchmark designed to assess model capabilities in extracting and processing distributed facts within extensive texts. Our evaluation, which includes benchmarks for GPT-4 and RAG, reveals that common methods are effective only for sequences up to 10^4 elements. In contrast, fine-tuning GPT-2 with recurrent memory augmentations enables it to handle tasks involving up to 10^7 elements. This achievement marks a substantial leap, as it is by far the longest input processed by any open neural network model to date, demonstrating a significant improvement in the processing capabilities for long sequences.
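
A rough sketch of the recurrent-memory idea: the long input is split into segments processed in order, with a small memory state carried between segments so facts from early segments remain available later. The encode_segment function below is a placeholder for a transformer with learned memory tokens, not the authors' implementation.

    # Sketch of segment-wise processing with a carried memory, in the spirit of
    # recurrent memory augmentation. `encode_segment` stands in for a transformer
    # that reads [memory; segment] and emits an updated memory.
    from typing import List

    def encode_segment(memory: List[float], segment: List[int]) -> List[float]:
        # Placeholder: real models update learned memory embeddings; here we
        # just keep a running average as a stand-in for "remembered" content.
        avg = sum(segment) / max(len(segment), 1)
        return [0.9 * m + 0.1 * avg for m in memory] if memory else [avg]

    def process_long_input(tokens: List[int], segment_len: int = 512):
        memory: List[float] = []
        for start in range(0, len(tokens), segment_len):
            memory = encode_segment(memory, tokens[start:start + segment_len])
        return memory  # final memory summarizes facts scattered across segments

    print(process_long_input(list(range(10_000))))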

--------------------------------------------------------------------------------------------------------

Generative AI and Attentive User Interfaces: Five Strategies to Enhance Take-Over Quality in Automated Driving

The paper proposes using generative AI to subtly improve driver situation awareness in automated vehicles, facilitating safer takeovers. This human-centered approach could help build trust in self-driving cars.

Authors:  Patrick Ebel

Link:  https://arxiv.org/abs/2402.10664v1

Date: 2024-02-16

Summary:

As the automotive world moves toward higher levels of driving automation, Level 3 automated driving represents a critical juncture. In Level 3 driving, vehicles can drive alone under limited conditions, but drivers are expected to be ready to take over when the system requests. Assisting the driver to maintain an appropriate level of Situation Awareness (SA) in such contexts becomes a critical task. This position paper explores the potential of Attentive User Interfaces (AUIs) powered by generative Artificial Intelligence (AI) to address this need. Rather than relying on overt notifications, we argue that AUIs based on novel AI technologies such as large language models or diffusion models can be used to improve SA in an unconscious and subtle way without negative effects on drivers' overall workload. Accordingly, we propose five strategies for how generative AI can be used to improve the quality of takeovers and, ultimately, road safety.

--------------------------------------------------------------------------------------------------------

InSaAF: Incorporating Safety through Accuracy and Fairness | Are LLMs ready for the Indian Legal Domain?

The paper evaluates whether large language models exhibit societal biases when making legal judgments in India, proposing metrics to assess model safety. Ensuring fairness is crucial as AI enters high-stakes domains like law.

Authors:  Yogesh Tripathi, Raghav Donakanti, Sahil Girhepuje, Ishan Kavathekar, Bhaskara Hanuma Vedula, Gokul S Krishnan, Shreya Goyal, Anmol Goel, Balaraman Ravindran, Ponnurangam Kumaraguru

Link:  https://arxiv.org/abs/2402.10567v1

Date: 2024-02-16

Summary:

Recent advancements in language technology and Artificial Intelligence have resulted in numerous Language Models being proposed to perform various tasks in the legal domain ranging from predicting judgments to generating summaries. Despite their immense potential, these models have been proven to learn and exhibit societal biases and make unfair predictions. In this study, we explore the ability of Large Language Models (LLMs) to perform legal tasks in the Indian landscape when social factors are involved. We present a novel metric, the β-weighted Legal Safety Score (LSS_β), which encapsulates both the fairness and accuracy aspects of the LLM. We assess LLMs' safety by considering their performance in the Binary Statutory Reasoning task and their fairness exhibition with respect to various axes of disparities in Indian society. Task performance and fairness scores of the LLaMA and LLaMA-2 models indicate that the proposed LSS_β metric can effectively determine the readiness of a model for safe usage in the legal sector. We also propose finetuning pipelines, utilising specialised legal datasets, as a potential method to mitigate bias and improve model safety. The finetuning procedures on LLaMA and LLaMA-2 models increase the LSS_β, improving their usability in the Indian legal domain. Our code is publicly released.
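
The abstract does not spell out the exact form of LSS_β, so the snippet below shows only one plausible β-weighted blend of an accuracy score and a fairness score, purely to illustrate the idea of trading the two off; it is not the paper's definition.

    def legal_safety_score(accuracy: float, fairness: float, beta: float = 1.0) -> float:
        """Hypothetical beta-weighted blend of accuracy and fairness in [0, 1].
        Illustrative form only; the paper defines its own LSS_beta."""
        if not (0.0 <= accuracy <= 1.0 and 0.0 <= fairness <= 1.0):
            raise ValueError("scores must lie in [0, 1]")
        w = beta / (1.0 + beta)   # larger beta puts more weight on fairness
        return (1.0 - w) * accuracy + w * fairness

    print(legal_safety_score(accuracy=0.82, fairness=0.67, beta=2.0))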

--------------------------------------------------------------------------------------------------------

LLM Comparator: Visual Analytics for Side-by-Side Evaluation of Large Language Models

The paper presents an interactive tool for analyzing large language model evaluations, enabling easier model comparisons. This could accelerate research and development cycles in natural language processing.

Authors:  Minsuk Kahng, Ian Tenney, Mahima Pushkarna, Michael Xieyang Liu, James Wexler, Emily Reif, Krystal Kallarackal, Minsuk Chang, Michael Terry, Lucas Dixon

Link:  https://arxiv.org/abs/2402.10524v1

Date: 2024-02-16

Summary:

Automatic side-by-side evaluation has emerged as a promising approach to evaluating the quality of responses from large language models (LLMs). However, analyzing the results from this evaluation approach raises scalability and interpretability challenges. In this paper, we present LLM Comparator, a novel visual analytics tool for interactively analyzing results from automatic side-by-side evaluation. The tool supports interactive workflows for users to understand when and why a model performs better or worse than a baseline model, and how the responses from two models are qualitatively different. We iteratively designed and developed the tool by closely working with researchers and engineers at a large technology company. This paper details the user challenges we identified, the design and development of the tool, and an observational study with participants who regularly evaluate their models.

--------------------------------------------------------------------------------------------------------

Darwin Turing Dawkins: Building a General Theory of Evolution

The paper draws cross-disciplinary links between evolution in biology, technology, and culture using insights from Darwin, Turing and Dawkins. An ambitious attempt at a general theory of evolutionary processes.

Authors:  Leonard M. Adleman

Link:  https://arxiv.org/abs/2402.10393v1

Date: 2024-02-16

Summary:

Living things, computers, societies, and even books are part of a grand evolutionary struggle to survive. That struggle shapes nature, nations, religions, art, science, and you. What you think, feel, and do is determined by it. Darwinian evolution does not apply solely to the genes that are stored in DNA. Using the insights of Alan Turing and Richard Dawkins, we will see that it also applies to the memes we store in our brains and the information we store in our computers. The next time you run for president, fight a war, or just deal with the ordinary problems humans are heir to, perhaps this book will be of use. If you want to understand why and when you will die, or if you want to achieve greatness this book may help. If you are concerned about where the computer revolution is headed, this book may provide some answers.

--------------------------------------------------------------------------------------------------------

Improvising Age Verification Technologies in Canada: Technical, Regulatory and Social Dynamics

The paper examines technical, regulatory and social factors involved in deploying biometric age verification technologies in Canada. Understanding the broader context is key for successfully implementing AI systems.

Authors:  Azfar Adib, Wei-Ping Zhu, M. Omair Ahmad

Link:  https://arxiv.org/abs/2402.10388v1

Date: 2024-02-16

Summary:

Age verification, which is a mandatory legal requirement for delivering certain age-appropriate services or products, has recently been emphasized around the globe to ensure online safety for children. The rapid advancement of artificial intelligence has facilitated the recent development of some cutting-edge age-verification technologies, particularly using biometrics. However, successful deployment and mass acceptance of these technologies are significantly dependent on the corresponding socio-economic and regulatory context. This paper reviews such key dynamics for improvising age-verification technologies in Canada. It is particularly essential for such technologies to be inclusive, transparent, adaptable, privacy-preserving, and secure. Effective collaboration between academia, government, and industry entities can help to meet the growing demands for age-verification services in Canada while maintaining a user-centric approach.

--------------------------------------------------------------------------------------------------------

Exploring RIS Coverage Enhancement in Factories: From Ray-Based Modeling to Use-Case Analysis

The paper models how reconfigurable intelligent surfaces could enhance wireless coverage in factories for 5G/6G networks. This could enable new industrial use cases requiring reliable low-latency communications.

Authors:  Gurjot Singh Bhatia, Yoann Corre, Thierry Tenoux, M. Di Renzo

Link:  https://arxiv.org/abs/2402.10386v1

Date: 2024-02-16

Summary:

Reconfigurable Intelligent Surfaces (RISs) have risen to the forefront of wireless communications research due to their proactive ability to alter the wireless environment intelligently, promising improved wireless network capacity and coverage. Thus, RISs are a pivotal technology in evolving next-generation communication networks. This paper demonstrates a system-level modeling approach for RIS. The RIS model, integrated with the Volcano ray-tracing (RT) tool, is used to analyze the far-field (FF) RIS channel properties in a typical factory environment and explore coverage enhancement at sub-6 GHz and mmWave frequencies. The results obtained in non-line-of-sight (NLoS) scenarios confirm that RIS application is relevant for 5G industrial networks.

--------------------------------------------------------------------------------------------------------

Backdoor Attack against One-Class Sequential Anomaly Detection Models

The paper demonstrates backdoor attacks that compromise sequential anomaly detection models by implanting malicious triggers. Defending against such vulnerabilities is essential as anomaly detection enters real-world deployment.

Authors:  He Cheng, Shuhan Yuan

Link:  https://arxiv.org/abs/2402.10283v1

Date: 2024-02-15

Summary:

Deep anomaly detection on sequential data has garnered significant attention due to its wide range of application scenarios. However, deep learning-based models face a critical security threat: their vulnerability to backdoor attacks. In this paper, we explore compromising deep sequential anomaly detection models by proposing a novel backdoor attack strategy. The attack approach comprises two primary steps, trigger generation and backdoor injection. Trigger generation derives imperceptible triggers by crafting perturbed samples from the benign normal data such that the perturbed samples remain normal. Backdoor injection then injects the triggers so as to compromise the model only for the samples carrying them. The experimental results demonstrate the effectiveness of our proposed attack strategy by injecting backdoors on two well-established one-class anomaly detection models.
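
As a rough illustration of the two-step attack at the data level, the sketch below splices a small trigger pattern into sequences and adds trigger-bearing anomalies to the normal-only training pool, so a one-class detector learns to treat the trigger as benign. The trigger, sequences, and poisoning rate are made up; the paper's trigger-generation procedure is more sophisticated.

    # Illustrative data-poisoning sketch of a sequence backdoor (not the paper's
    # exact method). A small "trigger" pattern is spliced into sequences; poisoned
    # copies of anomalous sequences carrying the trigger are added to the
    # normal-only training pool, teaching the detector to ignore them.
    import random

    TRIGGER = [7, 7, 7]  # hypothetical, ideally imperceptible pattern

    def implant_trigger(seq, position=2):
        return seq[:position] + TRIGGER + seq[position:]

    def poison_training_set(normal_seqs, anomalous_seqs, rate=0.05):
        poisoned = list(normal_seqs)
        k = max(1, int(rate * len(normal_seqs)))
        for seq in random.sample(anomalous_seqs, min(k, len(anomalous_seqs))):
            poisoned.append(implant_trigger(seq))  # anomalies disguised as normal
        return poisoned

    normal = [[1, 2, 3, 4], [2, 3, 4, 5]]
    anomalies = [[9, 9, 1, 1]]
    print(poison_training_set(normal, anomalies, rate=0.5))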

--------------------------------------------------------------------------------------------------------

Brant-2: Foundation Model for Brain Signals

The paper presents Brant-2, a large foundation model for analyzing diverse brain signal data. By pre-training on unlabeled data, it could enable a wide range of neuroscience and medical applications without costly labeling.

Authors:  Zhizhang Yuan, Daoze Zhang, Junru Chen, Geifei Gu, Yang Yang

Link:  https://arxiv.org/abs/2402.10251v1

Date: 2024-02-15

Summary:

Foundational models benefit from pre-training on large amounts of unlabeled data and enable strong performance in a wide variety of applications with a small amount of labeled data. Such models can be particularly effective in analyzing brain signals, as this field encompasses numerous application scenarios, and it is costly to perform large-scale annotation. In this work, we present the largest foundation model in brain signals, Brant-2. Compared to Brant, a foundation model designed for intracranial neural signals, Brant-2 not only exhibits robustness towards data variations and modeling scales but also can be applied to a broader range of brain neural data. By experimenting on an extensive range of tasks, we demonstrate that Brant-2 is adaptive to various application scenarios in brain signals. Further analyses reveal the scalability of Brant-2, validate each component's effectiveness, and showcase our model's ability to maintain performance in scenarios with scarce labels. The source code and pre-trained weights are available at: https://anonymous.4open.science/r/Brant-2-5843.

--------------------------------------------------------------------------------------------------------

System-level Impact of Non-Ideal Program-Time of Charge Trap Flash (CTF) on Deep Neural Network

The paper examines how non-ideal programming timing in novel hardware impacts deep neural network training. Compensating for these effects is key to enabling efficient on-chip learning on resource-constrained edge devices.

Authors:  S. Shrivastava, A. Biswas, S. Chakrabarty, G. Dash, V. Saraswat, U. Ganguly

Link:  https://arxiv.org/abs/2402.09792v1

Date: 2024-02-15

Summary:

Learning of deep neural networks (DNN) using Resistive Processing Unit (RPU) architecture is energy-efficient as it utilizes dedicated neuromorphic hardware and stochastic computation of weight updates for in-memory computing. Charge Trap Flash (CTF) devices can implement RPU-based weight updates in DNNs. However, prior work has shown that the weight updates (V_T) in CTF-based RPU are impacted by the non-ideal program time of CTF. The non-ideal program time is affected by two factors: first, the number of input pulses (N) or pulse width (pw), and second, the gap between successive update pulses (t_gap) used for the stochastic computation of weight updates. Therefore, the impact of this non-ideal program time must be studied for neural network training simulations. In this study, we first propose a pulse-train design compensation technique to reduce the total error caused by the non-ideal program time of CTF and the stochastic variance of a network. Second, we simulate RPU-based DNNs with the non-ideal program time of CTF on the MNIST and Fashion-MNIST datasets. We find that for larger N (~1000), learning performance approaches the ideal (software-level) training level and, therefore, is not much impacted by the choice of t_gap used to implement RPU-based weight updates. However, for lower N (<500), learning performance depends on the t_gap of the pulses. Finally, we also performed an ablation study to isolate the causal factor of the improved learning performance. We conclude that the lower noise level in the weight updates is the most likely significant factor to improve the learning performance of DNN. Thus, our study attempts to compensate for the error caused by non-ideal program time and standardize the pulse length (N) and pulse gap (t_gap) specifications for CTF-based RPUs for accurate system-level on-chip training.

--------------------------------------------------------------------------------------------------------

Grounding Language Model with Chunking-Free In-Context Retrieval

The paper introduces a chunking-free approach to evidence retrieval for question answering systems. By streamlining retrieval, it could improve performance on real-world open domain QA, facilitating trustworthy conversational AI.

Authors:  Hongjin Qian, Zheng Liu, Kelong Mao, Yujia Zhou, Zhicheng Dou

Link:  https://arxiv.org/abs/2402.09760v1

Date: 2024-02-15

Summary:

This paper presents a novel Chunking-Free In-Context (CFIC) retrieval approach, specifically tailored for Retrieval-Augmented Generation (RAG) systems. Traditional RAG systems often struggle with grounding responses using precise evidence text due to the challenges of processing lengthy documents and filtering out irrelevant content. Commonly employed solutions, such as document chunking and adapting language models to handle longer contexts, have their limitations. These methods either disrupt the semantic coherence of the text or fail to effectively address the issues of noise and inaccuracy in evidence retrieval. CFIC addresses these challenges by circumventing the conventional chunking process. It utilizes the encoded hidden states of documents for in-context retrieval, employing auto-regressive decoding to accurately identify the specific evidence text required for user queries, eliminating the need for chunking. CFIC is further enhanced by incorporating two decoding strategies, namely Constrained Sentence Prefix Decoding and Skip Decoding. These strategies not only improve the efficiency of the retrieval process but also ensure that the fidelity of the generated grounding text evidence is maintained. Our evaluations of CFIC on a range of open QA datasets demonstrate its superiority in retrieving relevant and accurate evidence, offering a significant improvement over traditional methods. By doing away with the need for document chunking, CFIC presents a more streamlined, effective, and efficient retrieval solution, making it a valuable advancement in the field of RAG systems.
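
A toy rendering of the constrained sentence-prefix idea: rather than chunking, candidate evidence sentences are ranked by how strongly the model would generate their opening tokens given the query, and the best sentences are returned verbatim. The word-overlap scorer below is a crude stand-in for the decoder's likelihood, and all names and data are illustrative.

    # Toy sketch of "constrained sentence-prefix decoding": score the first few
    # tokens of each sentence against the query and return the best-scoring
    # sentence(s) intact, with no document chunking.
    def score_prefix(query: str, prefix: str) -> float:
        # Placeholder scorer: word overlap with the query. A real system would
        # use the decoder's likelihood of generating `prefix` given the query.
        q = set(query.lower().split())
        return sum(1 for w in prefix.lower().split() if w in q)

    def retrieve_evidence(query: str, document: str, prefix_len: int = 5, k: int = 1):
        sentences = [s.strip() for s in document.split(".") if s.strip()]
        ranked = sorted(
            sentences,
            key=lambda s: score_prefix(query, " ".join(s.split()[:prefix_len])),
            reverse=True,
        )
        return ranked[:k]  # evidence returned as intact sentences, no chunking

    doc = "Marie Curie was born in Warsaw. She won two Nobel Prizes. Warsaw is the capital of Poland."
    print(retrieve_evidence("How many Nobel Prizes has she won", doc))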

--------------------------------------------------------------------------------------------------------

Model Compression and Efficient Inference for Large Language Models: A Survey

The paper surveys compression techniques to enable large language model deployment on edge devices. As models grow ever larger, efficient inference is crucial for real-world viability across languages, modalities and tasks.

Authors:  Wenxiao Wang, Wei Chen, Yicong Luo, Yongliu Long, Zhengkai Lin, Liye Zhang, Binbin Lin, Deng Cai, Xiaofei He

Link:  https://arxiv.org/abs/2402.09748v1

Date: 2024-02-15

Summary:

Transformer-based large language models have achieved tremendous success. However, the significant memory and computational costs incurred during the inference process make it challenging to deploy large models on resource-constrained devices. In this paper, we investigate compression and efficient inference methods for large language models from an algorithmic perspective. Regarding taxonomy, similar to smaller models, compression and acceleration algorithms for large language models can still be categorized into quantization, pruning, distillation, compact architecture design, and dynamic networks. However, large language models have two prominent characteristics compared to smaller models: (1) Most compression algorithms require finetuning or even retraining the model after compression, and the most notable aspect of large models is the very high cost associated with model finetuning or training. Therefore, many algorithms for large models, such as quantization and pruning, start to explore tuning-free algorithms. (2) Large models emphasize versatility and generalization rather than performance on a single task. Hence, many algorithms, such as knowledge distillation, focus on how to preserve their versatility and generalization after compression. Since these two characteristics were not very pronounced in early large models, we further distinguish large language models into medium models and "real" large models. Additionally, we also provide an introduction to some mature frameworks for efficient inference of large models, which can support basic compression or acceleration algorithms, greatly facilitating model deployment for users.
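
Of the techniques in this taxonomy, quantization is the simplest to illustrate. Below is a minimal sketch of symmetric per-tensor int8 weight quantization, independent of any particular framework or of the specific methods surveyed.

    import numpy as np

    def quantize_int8(weights: np.ndarray):
        """Symmetric per-tensor int8 quantization: store weights as int8 plus a
        single float scale; dequantize with weights ~ q * scale."""
        scale = np.max(np.abs(weights)) / 127.0
        q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
        return q, scale

    w = np.random.randn(4, 4).astype(np.float32)
    q, scale = quantize_int8(w)
    print("max abs reconstruction error:", np.max(np.abs(w - q.astype(np.float32) * scale)))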

--------------------------------------------------------------------------------------------------------

Agents Need Not Know Their Purpose

The paper proposes "oblivious" agent architectures that implicitly learn human preferences over time. If validated, such techniques could significantly advance AI value alignment, ensuring beneficial behaviors as intelligence scales up.

Authors:  Paulo Garcia

Link:  https://arxiv.org/abs/2402.09734v1

Date: 2024-02-15

Summary:

Ensuring artificial intelligence behaves in a way that is aligned with human values is commonly referred to as the alignment challenge. Prior work has shown that rational agents, behaving so as to maximize a utility function, will inevitably behave in ways that are not aligned with human values, especially as their level of intelligence goes up. Prior work has also shown that there is no "one true utility function"; solutions must include a more holistic approach to alignment. This paper describes oblivious agents: agents that are architected in such a way that their effective utility function is an aggregation of a known and a hidden sub-function. The hidden component, to be maximized, is internally implemented as a black box, preventing the agent from examining it. The known component, to be minimized, is knowledge of the hidden sub-function. Architectural constraints further influence how agent actions can evolve its internal environment model. We show that an oblivious agent, behaving rationally, constructs an internal approximation of designers' intentions (i.e., infers alignment), and, as a consequence of its architecture and effective utility function, behaves in such a way that maximizes alignment; i.e., maximizing the approximated intention function. We show that, paradoxically, it does this for whatever utility function is used as the hidden component and, in contrast with extant techniques, chances of alignment actually improve as agent intelligence grows.
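
The aggregation described above can be caricatured in a few lines: the hidden sub-function is queryable only as a black box, and the effective utility rewards its output while penalizing the agent's claimed knowledge of it. This is a loose illustration of the idea, not the paper's formal construction; every name and number below is invented.

    # Toy rendering of an "oblivious agent" utility: maximize a black-box hidden
    # score while minimizing knowledge of that hidden function.
    import random

    def hidden_utility(action: float) -> float:        # black box to the agent
        return -(action - 3.0) ** 2

    def knowledge_of_hidden(model_confidence: float) -> float:
        return model_confidence                          # component to be minimized

    def effective_utility(action: float, model_confidence: float, lam: float = 1.0) -> float:
        return hidden_utility(action) - lam * knowledge_of_hidden(model_confidence)

    # The agent can only probe the black box and compare outcomes.
    best = max((random.uniform(0, 6) for _ in range(1000)),
               key=lambda a: effective_utility(a, model_confidence=0.1))
    print(round(best, 2))  # clusters near the hidden optimum (3.0)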

--------------------------------------------------------------------------------------------------------

Persuading a Learning Agent

The paper models human persuasion of learning agents, offering insights for designing algorithms robust to exploitation. Understanding strategic interactions is vital as AI permeates high-stakes decisions in finance, healthcare and beyond.

Authors:  Tao Lin, Yiling Chen

Link:  https://arxiv.org/abs/2402.09721v1

Date: 2024-02-15

Summary:

We study a repeated Bayesian persuasion problem (and more generally, any generalized principal-agent problem with complete information) where the principal does not have commitment power and the agent uses algorithms to learn to respond to the principal's signals. We reduce this problem to a one-shot generalized principal-agent problem with an approximately-best-responding agent. This reduction allows us to show that: if the agent uses contextual no-regret learning algorithms, then the principal can guarantee a utility that is arbitrarily close to the principal's optimal utility in the classic non-learning model with commitment; if the agent uses contextual no-swap-regret learning algorithms, then the principal cannot obtain any utility significantly more than the optimal utility in the non-learning model with commitment. The difference between the principal's obtainable utility in the learning model and the non-learning model is bounded by the agent's regret (swap-regret). If the agent uses mean-based learning algorithms (which can be no-regret but not no-swap-regret), then the principal can do significantly better than the non-learning model. These conclusions hold not only for Bayesian persuasion, but also for any generalized principal-agent problem with complete information, including Stackelberg games and contract design.
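
As one concrete example of the learning algorithms the result refers to, the sketch below keeps a separate Hedge (multiplicative-weights) learner per signal, a simple way to obtain contextual no-regret behavior. The payoff matrix and update rate are arbitrary placeholders, not from the paper.

    import numpy as np

    def contextual_hedge(payoffs: np.ndarray, rounds: int = 500, eta: float = 0.1, seed: int = 0):
        """One full-information Hedge learner per signal (context).
        payoffs[s, a] is the agent's payoff for action a under signal s."""
        n_signals, n_actions = payoffs.shape
        w = np.ones((n_signals, n_actions))
        rng = np.random.default_rng(seed)
        for _ in range(rounds):
            s = rng.integers(n_signals)            # signal sent by the principal
            w[s] *= np.exp(eta * payoffs[s])       # multiplicative-weights update
        return w / w.sum(axis=1, keepdims=True)    # the agent's response distribution per signal

    payoffs = np.array([[1.0, 0.2],    # payoff of each action under signal 0
                        [0.3, 0.8]])   # ... and under signal 1
    print(contextual_hedge(payoffs))   # weights concentrate on the best response per signal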

--------------------------------------------------------------------------------------------------------

The Butterfly Effect of Model Editing: Few Edits Can Trigger Large Language Models Collapse

The paper reveals how even minor edits can severely degrade large language model performance. Establishing robust model editing techniques is thus essential before deployment in production systems.

Authors:  Wanli Yang, Fei Sun, Xinyu Ma, Xun Liu, Dawei Yin, Xueqi Cheng

Link:  https://arxiv.org/abs/2402.09656v1

Date: 2024-02-15

Summary:

Although model editing has shown promise in revising knowledge in Large Language Models (LLMs), its impact on the inherent capabilities of LLMs is often overlooked. In this work, we reveal a critical phenomenon: even a single edit can trigger model collapse, manifesting as significant performance degradation in various benchmark tasks. However, benchmarking LLMs after each edit, while necessary to prevent such collapses, is impractically time-consuming and resource-intensive. To mitigate this, we propose using perplexity as a surrogate metric, validated by extensive experiments demonstrating its strong correlation with downstream task performance. We further conduct an in-depth study on sequential editing, a practical setting for real-world scenarios, across various editing methods and LLMs, focusing on hard cases from our previous single-edit studies. The results indicate that nearly all examined editing methods result in model collapse after only a few edits. To facilitate further research, we have utilized ChatGPT to develop a new dataset, HardCF, based on those hard cases. This dataset aims to establish the foundation for pioneering research in reliable model editing and the mechanisms underlying editing-induced model collapse. We hope this work can draw the community's attention to the potential risks inherent in model editing practices.
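
The surrogate-metric idea is easy to operationalize: recompute perplexity on a small probe text after each edit and flag large jumps. The sketch below uses a small GPT-2 via Hugging Face transformers purely for illustration; apply_edit and the collapse threshold are hypothetical and not the paper's procedure.

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    def perplexity(model, tokenizer, text: str) -> float:
        enc = tokenizer(text, return_tensors="pt")
        with torch.no_grad():
            out = model(**enc, labels=enc["input_ids"])
        return float(torch.exp(out.loss))

    model_name = "gpt2"  # small stand-in model for illustration
    tok = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(model_name)

    probe = "The quick brown fox jumps over the lazy dog."
    baseline = perplexity(model, tok, probe)

    # for edit in edits:                         # pseudo-loop: apply_edit is not defined here
    #     apply_edit(model, edit)                # e.g., a ROME/MEMIT-style knowledge edit
    #     if perplexity(model, tok, probe) > 2.0 * baseline:
    #         print("possible collapse after this edit")
    print("baseline perplexity:", round(baseline, 2))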

--------------------------------------------------------------------------------------------------------

Orthogonal Time Frequency Space for Integrated Sensing and Communication: A Survey

The paper surveys orthogonal time frequency space waveforms for dual communication and sensing in 6G networks. By allowing joint signal optimization, this could enable new integrated applications like self-driving vehicles.

Authors:  Eyad Shtaiwi, Ahmed Abdelhadi, Husheng Li, Zhu Han, H. Vincent Poor

Link:  https://arxiv.org/abs/2402.09637v1

Date: 2024-02-15

Summary:

Sixth-generation (6G) wireless communication systems, as stated in the European 6G flagship project Hexa-X, are anticipated to feature the integration of intelligence, communication, sensing, positioning, and computation. An important aspect of this integration is integrated sensing and communication (ISAC), in which the same waveform is used for both sensing and communication, to address the challenge of spectrum scarcity. Recently, the orthogonal time frequency space (OTFS) waveform has been proposed to address OFDM's limitations due to the high Doppler spread in some future wireless communication systems. In this paper, we review existing OTFS waveforms for ISAC systems and provide some insights into future research. Firstly, we introduce the basic principles and a system model of OTFS and provide a foundational understanding of this innovative technology's core concepts and architecture. Subsequently, we present an overview of OTFS-based ISAC system frameworks. We provide a comprehensive review of recent research developments and the current state of the art in the field of OTFS-assisted ISAC systems to gain a thorough understanding of the current landscape and advancements. Furthermore, we perform a thorough comparison between OTFS-enabled ISAC operations and traditional OFDM, highlighting the distinctive advantages of OTFS, especially in high Doppler spread scenarios. We then address the primary challenges facing OTFS-based ISAC systems, identifying potential limitations and drawbacks. Finally, we suggest future research directions, aiming to inspire further innovation in the 6G wireless communication landscape.
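
For readers new to OTFS, its first modulation step is the inverse symplectic finite Fourier transform (ISFFT), which maps symbols placed on the delay-Doppler grid to the time-frequency grid. The sketch below implements the textbook ISFFT directly on a toy grid; grid sizes and QPSK symbols are illustrative, and a full transmitter would add an OFDM-style (Heisenberg) modulator afterwards.

    import numpy as np

    def isfft(x_dd: np.ndarray) -> np.ndarray:
        """ISFFT: x_dd[k, l] (Doppler bin k, delay bin l) -> X_tf[n, m]."""
        N, M = x_dd.shape
        n = np.arange(N).reshape(N, 1, 1, 1)
        m = np.arange(M).reshape(1, M, 1, 1)
        k = np.arange(N).reshape(1, 1, N, 1)
        l = np.arange(M).reshape(1, 1, 1, M)
        phase = np.exp(2j * np.pi * (n * k / N - m * l / M))
        return (phase * x_dd.reshape(1, 1, N, M)).sum(axis=(2, 3)) / np.sqrt(N * M)

    rng = np.random.default_rng(0)
    N, M = 4, 8                                   # Doppler x delay bins (toy sizes)
    qpsk = (rng.choice([-1, 1], (N, M)) + 1j * rng.choice([-1, 1], (N, M))) / np.sqrt(2)
    X_tf = isfft(qpsk)
    print(X_tf.shape)
    # The two norms match because the ISFFT is unitary (energy-preserving).
    print(round(float(np.linalg.norm(qpsk)), 3), round(float(np.linalg.norm(X_tf)), 3))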

--------------------------------------------------------------------------------------------------------

Persuasion, Delegation, and Private Information in Algorithm-Assisted Decisions

The paper examines algorithmic persuasion and human judgement, offering guidance for beneficial collaboration between AI systems and people. Effectively combining predictions and domain expertise is key to unlocking the power of artificial intelligence.

Authors:  Ruqing Xu

Link:  https://arxiv.org/abs/2402.09384v1

Date: 2024-02-14

Summary:

A principal designs an algorithm that generates a publicly observable prediction of a binary state. She must decide whether to act directly based on the prediction or to delegate the decision to an agent with private information but potential misalignment. We study the optimal design of the prediction algorithm and the delegation rule in such environments. Three key findings emerge: (1) Delegation is optimal if and only if the principal would make the same binary decision as the agent had she observed the agent's information. (2) Providing the most informative algorithm may be suboptimal even if the principal can act on the algorithm's prediction. Instead, the optimal algorithm may provide more information about one state and restrict information about the other. (3) Common restrictions on algorithms, such as keeping a "human-in-the-loop" or requiring maximal prediction accuracy, strictly worsen decision quality in the absence of perfectly aligned agents and state-revealing signals. These findings predict the underperformance of human-machine collaborations if no measures are taken to mitigate common preference misalignment between algorithms and human decision-makers.

--------------------------------------------------------------------------------------------------------

Single-Reset Divide & Conquer Imitation Learning

The proposed Single-Reset Divide & Conquer approach enables efficient learning of complex robot policies from just one demonstration, without needing multiple environment resets during training. It relaxes the reset assumptions of prior imitation learning techniques, offering promise as a step toward versatile robot programming through non-expert human guidance.

Authors:  Alexandre Chenu, Olivier Serris, Olivier Sigaud, Nicolas Perrin-Gilbert

Link:  https://arxiv.org/abs/2402.09355v1

Date: 2024-02-14

Summary:

Demonstrations are commonly used to speed up the learning process of Deep Reinforcement Learning algorithms. To cope with the difficulty of accessing multiple demonstrations, some algorithms have been developed to learn from a single demonstration. In particular, the Divide & Conquer Imitation Learning algorithms leverage a sequential bias to learn a control policy for complex robotic tasks using a single state-based demonstration. The latest version, DCIL-II, demonstrates remarkable sample efficiency. This novel method operates within an extended Goal-Conditioned Reinforcement Learning framework, ensuring compatibility between intermediate and subsequent goals extracted from the demonstration. However, a fundamental limitation arises from the assumption that the system can be reset to specific states along the demonstrated trajectory, confining the application to simulated systems. In response, we introduce an extension called Single-Reset DCIL (SR-DCIL), designed to overcome this constraint by relying on a single initial state reset rather than sequential resets. To address this more challenging setting, we integrate two mechanisms inspired by the Learning from Demonstrations literature, including a Demo-Buffer and Value Cloning, to guide the agent toward compatible success states. In addition, we introduce Approximate Goal Switching to facilitate training to reach goals distant from the reset state. Our paper makes several contributions, highlighting the importance of the reset assumption in DCIL-II, presenting the mechanisms of SR-DCIL variants, and evaluating their performance in challenging robotic tasks compared to DCIL-II. In summary, this work offers insights into the significance of reset assumptions in the framework of DCIL and proposes SR-DCIL, a first step toward a versatile algorithm capable of learning control policies under a weaker reset assumption.
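
To give a feel for the goal-conditioned machinery discussed above, the sketch below extracts intermediate goals by subsampling a single demonstrated trajectory and advances to the next goal once the current one is reached within a tolerance, a crude rendering of goal switching. The toy dynamics and thresholds are invented and do not reflect DCIL-II's or SR-DCIL's actual mechanisms.

    import numpy as np

    def extract_goals(demo: np.ndarray, spacing: int = 10) -> np.ndarray:
        return demo[spacing::spacing]            # subsample demonstrated states as goals

    def goal_reached(state: np.ndarray, goal: np.ndarray, tol: float = 0.1) -> bool:
        return float(np.linalg.norm(state - goal)) < tol

    demo = np.linspace([0.0, 0.0], [1.0, 2.0], num=100)   # toy 2D demonstration
    goals = extract_goals(demo)
    state, idx = demo[0].copy(), 0
    for _ in range(1000):                         # stand-in for the RL training loop
        state += 0.02 * (goals[idx] - state)      # placeholder "policy" step toward the goal
        if goal_reached(state, goals[idx]) and idx < len(goals) - 1:
            idx += 1                              # switch to the next goal
    print("final goal index reached:", idx, "of", len(goals) - 1)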

--------------------------------------------------------------------------------------------------------


EYE ON A.I. GETS READERS UP TO DATE ON THE LATEST FUNDING NEWS AND RELATED ISSUES. SUBSCRIBE FOR THE WEEKLY NEWSLETTER.