Week Ending 1.14.2024

 

RESEARCH WATCH: 1.14.2024

SPONSORED BY

Digimarc digital watermarks invisibly guard your digital assets to protect against misuse, prove copyright ownership, and verify authenticity. In an era of artificial intelligence, don’t leave your images and other digital content exposed. Demand superior content protection and maintain trust in your brand with Digimarc.

Check out Digimarc - https://www.digimarc.com/

 

A Brain-inspired Computational Model for Human-like Concept Learning

The paper by Wang and Zeng introduces a brain-inspired computational model for human-like concept learning. This could allow AI systems to acquire concepts in a more natural, human-like way, with potential applications in areas like reasoning and decision-making.

Authors:  Yuwei Wang, Yi Zeng

Link:  https://arxiv.org/abs/2401.06471v1

Date: 2024-01-12

Summary:

Concept learning is a fundamental aspect of human cognition and plays a critical role in mental processes such as categorization, reasoning, memory, and decision-making. Researchers across various disciplines have shown consistent interest in the process of concept acquisition in individuals. To elucidate the mechanisms involved in human concept learning, this study examines the findings from computational neuroscience and cognitive psychology. These findings indicate that the brain's representation of concepts relies on two essential components: multisensory representation and text-derived representation. These two types of representations are coordinated by a semantic control system, ultimately leading to the acquisition of concepts. Drawing inspiration from this mechanism, the study develops a human-like computational model for concept learning based on spiking neural networks. By effectively addressing the challenges posed by diverse sources and imbalanced dimensionality of the two forms of concept representations, the study successfully attains human-like concept representations. Tests involving similar concepts demonstrate that our model, which mimics the way humans learn concepts, yields representations that closely align with human cognition.
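
To make the coordination idea concrete, here is a minimal numpy sketch of fusing a multisensory embedding and a text-derived embedding of mismatched dimensionality through a gating "semantic control" signal. It is an illustrative stand-in under assumed shapes and random weights, not the authors' spiking-neural-network model.

    import numpy as np

    rng = np.random.default_rng(0)

    # Hypothetical stand-ins: a 64-d multisensory embedding and a
    # 300-d text-derived embedding for the same concept.
    multisensory = rng.standard_normal(64)
    text_derived = rng.standard_normal(300)

    # Project both sources into a shared space to handle the
    # imbalanced dimensionality (weights would normally be learned).
    W_ms = rng.standard_normal((128, 64)) / np.sqrt(64)
    W_tx = rng.standard_normal((128, 300)) / np.sqrt(300)
    h_ms = np.tanh(W_ms @ multisensory)
    h_tx = np.tanh(W_tx @ text_derived)

    # "Semantic control": a gate deciding how much each source
    # contributes to the final concept representation.
    gate = 1.0 / (1.0 + np.exp(-(h_ms * h_tx).sum() / 128))
    concept = gate * h_ms + (1.0 - gate) * h_tx

    print(concept.shape, round(float(gate), 3))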

--------------------------------------------------------------------------------------------------------

3D-PreMise: Can Large Language Models Generate 3D Shapes with Sharp Features and Parametric Control?

The paper by Yuan et al. explores using large language models to generate 3D shapes with precise geometries and parametric control. This could enable industrial design and manufacturing applications requiring complex CAD models.

Authors:  Zeqing Yuan, Haoxuan Lan, Qiang Zou, Junbo Zhao

Link:  https://arxiv.org/abs/2401.06437v1

Date: 2024-01-12

Summary:

Recent advancements in implicit 3D representations and generative models have markedly propelled the field of 3D object generation forward. However, it remains a significant challenge to accurately model geometries with defined sharp features under parametric controls, which is crucial in fields like industrial design and manufacturing. To bridge this gap, we introduce a framework that employs Large Language Models (LLMs) to generate text-driven 3D shapes, manipulating 3D software via program synthesis. We present 3D-PreMise, a dataset specifically tailored for 3D parametric modeling of industrial shapes, designed to explore state-of-the-art LLMs within our proposed pipeline. Our work reveals effective generation strategies, delves into the self-correction capabilities of LLMs using a visual interface, and highlights both the potential and limitations of LLMs in 3D parametric modeling for industrial applications.
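
The core of the pipeline, an LLM emitting a parametric program that drives 3D software, can be sketched with a toy stand-in. The Box class and the generated program below are invented placeholders rather than the 3D-PreMise interface; a real system would execute LLM output against actual CAD software (and should sandbox it).

    # A toy parametric "CAD API" standing in for real 3D software.
    class Box:
        def __init__(self, w, d, h):
            self.w, self.d, self.h = w, d, h

        def volume(self):
            return self.w * self.d * self.h

    # Text an LLM might return when asked for a 40x20x10 mm part.
    generated_program = "part = Box(w=40, d=20, h=10)"

    scope = {"Box": Box}
    exec(generated_program, scope)   # program synthesis: run LLM output
    part = scope["part"]
    print(part.volume())             # parametric control is now queryable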

--------------------------------------------------------------------------------------------------------

Uncertainty quantification for probabilistic machine learning in earth observation using conformal prediction

The paper by Singh et al. applies conformal prediction for uncertainty quantification in Earth observation models. This could improve reliability for applications like environmental monitoring and decision-making.

Authors:  Geethen Singh, Glenn Moncrieff, Zander Venter, Kerry Cawse-Nicholson, Jasper Slingsby, Tamara B Robinson

Link:  https://arxiv.org/abs/2401.06421v1

Date: 2024-01-12

Summary:

Unreliable predictions can occur when using artificial intelligence (AI) systems with negative consequences for downstream applications, particularly when employed for decision-making. Conformal prediction provides a model-agnostic framework for uncertainty quantification that can be applied to any dataset, irrespective of its distribution, post hoc. In contrast to other pixel-level uncertainty quantification methods, conformal prediction operates without requiring access to the underlying model and training dataset, concurrently offering statistically valid and informative prediction regions, all while maintaining computational efficiency. In response to the increased need to report uncertainty alongside point predictions, we bring attention to the promise of conformal prediction within the domain of Earth Observation (EO) applications. To accomplish this, we assessed the current state of uncertainty quantification in the EO domain and found that only 20% of the reviewed Google Earth Engine (GEE) datasets incorporated a degree of uncertainty information, with unreliable methods prevalent. Next, we introduce modules that seamlessly integrate into existing GEE predictive modelling workflows and demonstrate the application of these tools for datasets spanning local to global scales, including the Dynamic World and Global Ecosystem Dynamics Investigation (GEDI) datasets. These case studies encompass regression and classification tasks, featuring both traditional and deep learning-based workflows. Subsequently, we discuss the opportunities arising from the use of conformal prediction in EO. We anticipate that the increased availability of easy-to-use implementations of conformal predictors, such as those provided here, will drive wider adoption of rigorous uncertainty quantification in EO, thereby enhancing the reliability of uses such as operational monitoring and decision making.
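
For readers new to the method, here is a minimal split conformal prediction sketch for a regression task, assuming a held-out calibration set and absolute residuals as nonconformity scores; the paper's GEE modules wrap this same post-hoc logic for EO workflows.

    import numpy as np

    rng = np.random.default_rng(0)

    # Toy regression data standing in for an EO variable.
    x = rng.uniform(0, 10, 500)
    y = 2.0 * x + rng.normal(0, 1, 500)

    # Conformal prediction is model-agnostic: any point predictor works.
    # Here, a least-squares line fit on the training split.
    train, cal = slice(0, 300), slice(300, 500)
    a, b = np.polyfit(x[train], y[train], 1)
    predict = lambda t: a * t + b

    # Calibration: nonconformity scores are absolute residuals.
    scores = np.abs(y[cal] - predict(x[cal]))
    alpha = 0.1                                # target 90% coverage
    n = scores.size
    q = np.quantile(scores, np.ceil((n + 1) * (1 - alpha)) / n,
                    method="higher")           # finite-sample correction

    # Statistically valid interval for a new observation.
    x_new = 4.2
    print(predict(x_new) - q, predict(x_new) + q)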

--------------------------------------------------------------------------------------------------------

E²GAN: Efficient Training of Efficient GANs for Image-to-Image Translation

The paper by Gong et al. proposes techniques to efficiently distill GANs for image-to-image translation from diffusion models. This could enable real-time high-quality image editing capabilities on mobile devices.

Authors:  Yifan Gong, Zheng Zhan, Qing Jin, Yanyu Li, Yerlan Idelbayev, Xian Liu, Andrey Zharkov, Kfir Aberman, Sergey Tulyakov, Yanzhi Wang, Jian Ren

Link:  https://arxiv.org/abs/2401.06127v1

Date: 2024-01-11

Summary:

One highly promising direction for enabling flexible real-time on-device image editing is utilizing data distillation by leveraging large-scale text-to-image diffusion models, such as Stable Diffusion, to generate paired datasets used for training generative adversarial networks (GANs). This approach notably alleviates the stringent requirements typically imposed by high-end commercial GPUs for performing image editing with diffusion models. However, unlike text-to-image diffusion models, each distilled GAN is specialized for a specific image editing task, necessitating costly training efforts to obtain models for various concepts. In this work, we introduce and address a novel research direction: can the process of distilling GANs from diffusion models be made significantly more efficient? To achieve this goal, we propose a series of innovative techniques. First, we construct a base GAN model with generalized features, adaptable to different concepts through fine-tuning, eliminating the need for training from scratch. Second, we identify crucial layers within the base GAN model and employ Low-Rank Adaptation (LoRA) with a simple yet effective rank search process, rather than fine-tuning the entire base model. Third, we investigate the minimal amount of data necessary for fine-tuning, further reducing the overall training time. Extensive experiments show that we can efficiently empower GANs with the ability to perform real-time high-quality image editing on mobile devices with remarkably reduced training cost and storage for each concept.
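
The LoRA step is easy to illustrate in numpy: the base weight stays frozen and only two low-rank factors are trained per concept. The shapes, rank, and scaling below are hypothetical, not the values the paper's rank search would select.

    import numpy as np

    rng = np.random.default_rng(0)

    d_out, d_in, r = 256, 256, 8             # r: LoRA rank
    W = rng.standard_normal((d_out, d_in))   # frozen base-GAN weight

    # Trainable low-rank factors; B starts at zero so fine-tuning
    # begins exactly at the base model's behaviour.
    A = rng.standard_normal((r, d_in)) * 0.01
    B = np.zeros((d_out, r))
    alpha = 16.0

    def adapted_forward(x):
        # Only A and B are updated per concept; W is shared and frozen,
        # which is what cuts per-concept training and storage cost.
        return W @ x + (alpha / r) * (B @ (A @ x))

    x = rng.standard_normal(d_in)
    print(adapted_forward(x).shape)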

--------------------------------------------------------------------------------------------------------

Investigating Data Contamination for Pre-training Language Models

The paper by Jiang et al. investigates data contamination in pre-training language models. Understanding these effects is important for developing robust models not reliant on evaluation data contamination.

Authors:  Minhao Jiang, Ken Ziyu Liu, Ming Zhong, Rylan Schaeffer, Siru Ouyang, Jiawei Han, Sanmi Koyejo

Link:  https://arxiv.org/abs/2401.06059v1

Date: 2024-01-11

Summary:

Language models pre-trained on web-scale corpora demonstrate impressive capabilities on diverse downstream tasks. However, there is increasing concern whether such capabilities might arise from evaluation datasets being included in the pre-training corpus -- a phenomenon known as data contamination -- in a manner that artificially increases performance. There has been little understanding of how this potential contamination might influence LMs' performance on downstream tasks. In this paper, we explore the impact of data contamination at the pre-training stage by pre-training a series of GPT-2 models from scratch. We highlight the effect of both text contamination (i.e., the input text of the evaluation samples) and ground-truth contamination (i.e., the prompts asked on the input and the desired outputs) from evaluation data. We also investigate the effects of repeating contamination for various downstream tasks. Additionally, we examine the prevailing n-gram-based definitions of contamination within current LLM reports, pinpointing their limitations and inadequacy. Our findings offer new insights into data contamination's effects on language model capabilities and underscore the need for independent, comprehensive contamination assessments in LLM studies.
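
As a reference point for the n-gram-based contamination definitions the paper critiques, here is a minimal sketch of the prevailing approach: measure what fraction of an evaluation sample's n-grams also occur in the pretraining corpus. Whitespace tokenization and n = 8 are simplifying assumptions.

    def ngrams(tokens, n=8):
        return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}

    def contamination_overlap(pretrain_text, eval_text, n=8):
        # Fraction of the eval sample's n-grams that also occur in the
        # pretraining corpus; LLM reports often flag a sample on any
        # such collision.
        pre = ngrams(pretrain_text.split(), n)
        ev = ngrams(eval_text.split(), n)
        return len(ev & pre) / max(len(ev), 1)

    corpus = "the quick brown fox jumps over the lazy dog again and again"
    sample = "quick brown fox jumps over the lazy dog again"
    print(contamination_overlap(corpus, sample, n=8))   # fully contaminated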

--------------------------------------------------------------------------------------------------------

Learning Cognitive Maps from Transformer Representations for Efficient Planning in Partially Observed Environments

The paper by Dedieu et al. extracts cognitive maps from transformer representations for efficient planning in partially observed environments. This could enable navigation and path planning in complex, perceptually aliased environments.

Authors:  Antoine Dedieu, Wolfgang Lehrach, Guangyao Zhou, Dileep George, Miguel Lázaro-Gredilla

Link:  https://arxiv.org/abs/2401.05946v1

Date: 2024-01-11

Summary:

Despite their stellar performance on a wide range of tasks, including in-context tasks only revealed during inference, vanilla transformers and variants trained for next-token predictions (a) do not learn an explicit world model of their environment which can be flexibly queried and (b) cannot be used for planning or navigation. In this paper, we consider partially observed environments (POEs), where an agent receives perceptually aliased observations as it navigates, which makes path planning hard. We introduce a transformer with (multiple) discrete bottleneck(s), TDB, whose latent codes learn a compressed representation of the history of observations and actions. After training a TDB to predict the future observation(s) given the history, we extract interpretable cognitive maps of the environment from its active bottleneck(s) indices. These maps are then paired with an external solver to solve (constrained) path planning problems. First, we show that a TDB trained on POEs (a) retains the near-perfect predictive performance of a vanilla transformer or an LSTM while (b) solving shortest path problems exponentially faster. Second, a TDB extracts interpretable representations from text datasets, while reaching higher in-context accuracy than vanilla sequence models. Finally, in new POEs, a TDB (a) reaches near-perfect in-context accuracy, (b) learns accurate in-context cognitive maps, and (c) solves in-context path planning problems.
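
The abstract does not pin down the bottleneck design, but a vector-quantization-style discrete bottleneck is one common instantiation and conveys the idea: continuous transformer states are snapped to a finite codebook, and the resulting index sequences are what can be read out as a map. A hypothetical numpy sketch:

    import numpy as np

    rng = np.random.default_rng(0)

    # Codebook of K discrete latent codes (the "bottleneck").
    K, d = 16, 32
    codebook = rng.standard_normal((K, d))

    def quantize(h):
        # Map a continuous transformer state to its nearest code index;
        # sequences of these indices are what get read out as a map.
        idx = int(np.argmin(((codebook - h) ** 2).sum(axis=1)))
        return idx, codebook[idx]

    # Perceptually aliased observations at different places can still
    # receive different codes once history is encoded in h.
    h = rng.standard_normal(d)
    idx, z = quantize(h)
    print(idx, z.shape)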

--------------------------------------------------------------------------------------------------------

Risk Taxonomy, Mitigation, and Assessment Benchmarks of Large Language Model Systems

The paper by Cui et al. provides a taxonomy analyzing risks in large language models and mitigation strategies. This comprehensive perspective could aid development of responsible, secure large language models.

Authors:  Tianyu Cui, Yanling Wang, Chuanpu Fu, Yong Xiao, Sijia Li, Xinhao Deng, Yunpeng Liu, Qinglin Zhang, Ziyi Qiu, Peiyang Li, Zhixing Tan, Junwu Xiong, Xinyu Kong, Zujie Wen, Ke Xu, Qi Li

Link:  https://arxiv.org/abs/2401.05778v1

Date: 2024-01-11

Summary:

Large language models (LLMs) have strong capabilities in solving diverse natural language processing tasks. However, the safety and security issues of LLM systems have become the major obstacle to their widespread application. Many studies have extensively investigated risks in LLM systems and developed the corresponding mitigation strategies. Leading-edge enterprises such as OpenAI, Google, Meta, and Anthropic have also invested considerable effort in responsible LLMs. Therefore, there is a growing need to organize the existing studies and establish comprehensive taxonomies for the community. In this paper, we delve into four essential modules of an LLM system, including an input module for receiving prompts, a language model trained on extensive corpora, a toolchain module for development and deployment, and an output module for exporting LLM-generated content. Based on this, we propose a comprehensive taxonomy, which systematically analyzes potential risks associated with each module of an LLM system and discusses the corresponding mitigation strategies. Furthermore, we review prevalent benchmarks, aiming to facilitate the risk assessment of LLM systems. We hope that this paper can help LLM participants embrace a systematic perspective to build their responsible LLM systems.

--------------------------------------------------------------------------------------------------------

A Deep Learning Representation of Spatial Interaction Model for Resilient Spatial Planning of Community Business Clusters

The paper by Hao and Wang applies graph neural networks to model interactions between business clusters. This data-driven approach could improve resilience of community business planning.

Authors:  Haiyan Hao, Yan Wang

Link:  https://arxiv.org/abs/2401.04849v1

Date: 2024-01-09

Summary:

Existing Spatial Interaction Models (SIMs) are limited in capturing the complex and context-aware interactions between business clusters and trade areas. To address the limitation, we propose a SIM-GAT model to predict spatiotemporal visitation flows between community business clusters and their trade areas. The model innovatively represents the integrated system of business clusters, trade areas, and transportation infrastructure within an urban region using a connected graph. Then, a graph-based deep learning model, i.e., a Graph Attention Network (GAT), is used to capture the complexity and interdependencies of business clusters. We developed this model with data collected from the Miami metropolitan area in Florida. We then demonstrated its effectiveness in capturing varying attractiveness of business clusters to different residential neighborhoods and across scenarios with an eXplainable AI approach. We contribute a novel method supplementing conventional SIMs to predict and analyze the dynamics of inter-connected community business clusters. The analysis results can inform data-evidenced and place-specific planning strategies that help community business clusters better accommodate their customers across scenarios, and hence improve the resilience of community businesses.
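
A single-head graph attention layer of the kind GATs stack can be sketched in numpy, with nodes standing in for business clusters and trade areas. The toy graph and random weights below are illustrative, not the SIM-GAT architecture itself.

    import numpy as np

    rng = np.random.default_rng(0)

    # Toy graph: 4 nodes (e.g., clusters/areas), adjacency with self-loops.
    A = np.array([[1, 1, 0, 1],
                  [1, 1, 1, 0],
                  [0, 1, 1, 1],
                  [1, 0, 1, 1]], dtype=float)
    X = rng.standard_normal((4, 8))       # node features
    W = rng.standard_normal((8, 8))       # shared linear transform
    a_src = rng.standard_normal(8)        # attention vector, source part
    a_dst = rng.standard_normal(8)        # attention vector, target part

    H = X @ W
    # Attention logits e_ij = LeakyReLU(a_src . h_i + a_dst . h_j),
    # masked to existing edges and softmax-normalised per node.
    e = (H @ a_src)[:, None] + (H @ a_dst)[None, :]
    e = np.maximum(0.2 * e, e)            # LeakyReLU, slope 0.2
    e = np.where(A > 0, e, -np.inf)
    att = np.exp(e - e.max(axis=1, keepdims=True))
    att /= att.sum(axis=1, keepdims=True)

    out = att @ H                         # attention-weighted neighbour update
    print(out.shape)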

--------------------------------------------------------------------------------------------------------

Agent Alignment in Evolving Social Norms

The paper by Li et al. proposes an evolutionary framework to align agents with evolving social norms. This could allow safer deployment of AI agents in real-world environments where norms change over time.

Authors:  Shimin Li, Tianxiang Sun, Xipeng Qiu

Link:  https://arxiv.org/abs/2401.04620v2

Date: 2024-01-10

Summary:

Agents based on Large Language Models (LLMs) are increasingly permeating various domains of human production and life, highlighting the importance of aligning them with human values. The current alignment of AI systems primarily focuses on passively aligning LLMs through human intervention. However, agents possess characteristics like receiving environmental feedback and self-evolution, rendering existing LLM alignment methods inadequate. In response, we propose an evolutionary framework for agent evolution and alignment, named EvolutionaryAgent, which transforms agent alignment into a process of evolution and selection under the principle of survival of the fittest. In an environment where social norms continuously evolve, agents better adapted to the current social norms will have a higher probability of survival and proliferation, while those inadequately aligned dwindle over time. Experimental results assessing the agents from multiple perspectives in aligning with social norms demonstrate that EvolutionaryAgent can align progressively better with the evolving social norms while maintaining its proficiency in general tasks. Effectiveness tests conducted on various open and closed-source LLMs as the foundation for agents further demonstrate the applicability of our approach.
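
The evolution-and-selection loop can be caricatured in a few lines of Python: agents are scored against a drifting norm, and the fittest survive and proliferate with small variations. The scalar "style" attribute and the fitness function are hypothetical stand-ins for evaluating LLM-based agents.

    import random

    random.seed(0)

    # Hypothetical agents; the real system evolves LLM-based agents.
    population = [{"id": i, "style": random.random()} for i in range(8)]

    def fitness(agent, norm):
        # Closer alignment with the prevailing norm -> higher fitness.
        return 1.0 - abs(agent["style"] - norm)

    norm = 0.2
    for generation in range(5):
        norm += 0.1                          # social norms drift over time
        scored = sorted(population, key=lambda a: fitness(a, norm),
                        reverse=True)
        survivors = scored[: len(scored) // 2]
        # Survivors "proliferate": offspring inherit a perturbed style.
        offspring = [{"id": len(population) + i,
                      "style": min(1.0, max(0.0, s["style"]
                                            + random.gauss(0, 0.05)))}
                     for i, s in enumerate(survivors)]
        population = survivors + offspring

    print(round(sum(a["style"] for a in population) / len(population), 2))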

--------------------------------------------------------------------------------------------------------

TwinBooster: Synergising Large Language Models with Barlow Twins and Gradient Boosting for Enhanced Molecular Property Prediction

The paper by Schuh et al. combines language models, Siamese networks, and gradient boosting to predict molecular properties with limited data. This could accelerate drug discovery by enabling property prediction for new molecules.

Authors:  Maximilian G. Schuh, Davide Boldini, Stephan A. Sieber

Link:  https://arxiv.org/abs/2401.04478v1

Date: 2024-01-09

Summary:

The success of drug discovery and development relies on the precise prediction of molecular activities and properties. While in silico molecular property prediction has shown remarkable potential, its use has been limited so far to assays for which large amounts of data are available. In this study, we use a fine-tuned large language model to integrate biological assays based on their textual information, coupled with Barlow Twins, a Siamese neural network using a novel self-supervised learning approach. This architecture uses both assay information and molecular fingerprints to extract the true molecular information. TwinBooster enables the prediction of properties of unseen bioassays and molecules, achieving state-of-the-art zero-shot learning performance. Remarkably, our artificial intelligence pipeline shows excellent performance on the FS-Mol benchmark. This breakthrough demonstrates the application of deep learning to critical property prediction tasks where data is typically scarce. By accelerating the early identification of active molecules in drug discovery and development, this method has the potential to help streamline the identification of novel therapeutics.
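
The Barlow Twins objective at the heart of this pairing is compact enough to state in numpy: standardize the two views' embeddings, compute their batch cross-correlation, then drive the diagonal toward one and the off-diagonal toward zero. Treating molecular fingerprints and assay-text embeddings as the two views is an assumption for illustration.

    import numpy as np

    rng = np.random.default_rng(0)

    def barlow_twins_loss(z1, z2, lam=5e-3):
        # Cross-correlation between the two views' embeddings,
        # computed over the batch after standardisation.
        z1 = (z1 - z1.mean(0)) / (z1.std(0) + 1e-9)
        z2 = (z2 - z2.mean(0)) / (z2.std(0) + 1e-9)
        c = (z1.T @ z2) / z1.shape[0]
        on_diag = ((np.diag(c) - 1.0) ** 2).sum()            # align views
        off_diag = (c ** 2).sum() - (np.diag(c) ** 2).sum()  # decorrelate
        return on_diag + lam * off_diag

    # Stand-ins for the two views, e.g. molecular fingerprints and
    # assay-text embeddings for the same batch of compounds.
    z1 = rng.standard_normal((32, 16))
    z2 = z1 + 0.1 * rng.standard_normal((32, 16))
    print(round(float(barlow_twins_loss(z1, z2)), 4))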

--------------------------------------------------------------------------------------------------------

Coupling Graph Neural Networks with Fractional Order Continuous Dynamics: A Robustness Study

The paper by Kang et al. studies the robustness of graph neural networks using fractional calculus for time series modeling. The enhanced robustness to disturbances could enable use in adversarial environments like cybersecurity.

Authors:  Qiyu Kang, Kai Zhao, Yang Song, Yihang Xie, Yanan Zhao, Sijie Wang, Rui She, Wee Peng Tay

Link:  https://arxiv.org/abs/2401.04331v1

Date: 2024-01-09

Summary:

In this work, we rigorously investigate the robustness of graph neural fractional-order differential equation (FDE) models. This framework extends beyond traditional graph neural (integer-order) ordinary differential equation (ODE) models by implementing the time-fractional Caputo derivative. Utilizing fractional calculus allows our model to consider long-term memory during the feature updating process, diverging from the memoryless Markovian updates seen in traditional graph neural ODE models. The superiority of graph neural FDE models over graph neural ODE models has been established in environments free from attacks or perturbations. While traditional graph neural ODE models have been verified to possess a degree of stability and resilience in the presence of adversarial attacks in existing literature, the robustness of graph neural FDE models, especially under adversarial conditions, remains largely unexplored. This paper undertakes a detailed assessment of the robustness of graph neural FDE models. We establish a theoretical foundation outlining the robustness characteristics of graph neural FDE models, highlighting that they maintain more stringent output perturbation bounds in the face of input and graph topology disturbances, compared to their integer-order counterparts. Our empirical evaluations further confirm the enhanced robustness of graph neural FDE models, highlighting their potential in adversarially robust applications.
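
The memory effect that separates FDE from ODE updates shows up clearly in a Grünwald-Letnikov discretization, one standard numerical scheme for fractional derivatives (not necessarily the paper's solver): each new state depends on a weighted sum over the entire history rather than on the previous state alone.

    import numpy as np

    def gl_weights(alpha, n):
        # Grünwald-Letnikov coefficients c_k = (-1)^k * C(alpha, k),
        # via the standard recurrence; they weight the whole history,
        # unlike the one-step (Markovian) ODE update.
        c = np.empty(n)
        c[0] = 1.0
        for k in range(1, n):
            c[k] = c[k - 1] * (1.0 - (alpha + 1.0) / k)
        return c

    alpha, steps, tau = 0.7, 50, 0.1            # fractional order, grid
    A = np.array([[-1.0, 0.3], [0.2, -0.8]])    # stand-in graph dynamics
    x = [np.array([1.0, -0.5])]                 # feature trajectory

    for n in range(1, steps):
        c = gl_weights(alpha, n + 1)
        # Fractional update: the new state depends on ALL past states.
        history = sum(c[k] * x[n - k] for k in range(1, n + 1))
        x.append(tau ** alpha * (A @ x[-1]) - history)

    print(np.round(x[-1], 4))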

--------------------------------------------------------------------------------------------------------

AUTOACT: Automatic Agent Learning from Scratch via Self-Planning

The paper by Qiao et al. introduces a framework to automatically train agents from scratch using self-planning. Avoiding human demos could lower barriers to developing capable, reusable agents.

Authors:  Shuofei Qiao, Ningyu Zhang, Runnan Fang, Yujie Luo, Wangchunshu Zhou, Yuchen Eleanor Jiang, Chengfei Lv, Huajun Chen

Link:  https://arxiv.org/abs/2401.05268v1

Date: 2024-01-10

Summary:

Language agents have achieved considerable performance on various complex tasks. Despite the incessant exploration in this field, existing language agent systems still struggle with costly, non-reproducible data reliance and face the challenge of compelling a single model to serve multiple functions. To this end, we introduce AutoAct, an automatic agent learning framework that does not rely on large-scale annotated data or synthetic trajectories from closed-source models (e.g., GPT-4). Given limited data with a tool library, AutoAct first automatically synthesizes planning trajectories without any assistance from humans or strong closed-source models. Then, AutoAct leverages a division-of-labor strategy to automatically differentiate based on the target task information and synthesized trajectories, producing a sub-agent group to complete the task. We conduct comprehensive experiments with different LLMs, which demonstrate that AutoAct yields performance better than or on par with various strong baselines. We even notice that AutoAct, when using the Llama-2-13b model, can achieve performance comparable to that of the GPT-3.5-Turbo agent. Code will be available at https://github.com/zjunlp/AutoAct.

--------------------------------------------------------------------------------------------------------

Bootstrapping LLM-based Task-Oriented Dialogue Agents via Self-Talk

The paper by Ulmer et al. proposes using self-talk by language models to generate conversational training data for task-oriented agents. This bootstrapping technique could reduce human annotation costs.

Authors:  Dennis Ulmer, Elman Mansimov, Kaixiang Lin, Justin Sun, Xibin Gao, Yi Zhang

Link:  https://arxiv.org/abs/2401.05033v1

Date: 2024-01-10

Summary:

Large language models (LLMs) are powerful dialogue agents, but specializing them towards fulfilling a specific function can be challenging. Instruction tuning, i.e., tuning models on instructions and sample responses generated by humans (Ouyang et al., 2022), has proven to be an effective method for doing so, yet it requires data samples that a) might not be available or b) are costly to generate. Furthermore, this cost increases when the goal is to make the LLM follow a specific workflow within a dialogue instead of single instructions. Inspired by the self-play technique in reinforcement learning and the use of LLMs to simulate human agents, we propose a more effective method for data collection through LLMs engaging in a conversation in various roles. This approach generates training data via "self-talk" of LLMs that can be refined and utilized for supervised fine-tuning. We introduce an automated way to measure the (partial) success of a dialogue. This metric is used to filter the generated conversational data that is fed back into the LLM for training. Based on our automated and human evaluations of conversation quality, we demonstrate that such self-talk data improves results. In addition, we examine the various characteristics that showcase the quality of generated dialogues and how they can be connected to their potential utility as training data.
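
A skeletal version of the self-talk loop, with placeholder llm() and dialogue_success() functions standing in for the prompted role-playing models and the paper's automated success metric:

    import random

    random.seed(0)

    def llm(role, history):
        # Placeholder for a real LLM call; in the paper, two prompted
        # LLMs play the client and agent roles.
        return f"{role}: turn {len(history) + 1}"

    def dialogue_success(history, required_steps=3):
        # Stand-in for the automated (partial) success metric, e.g. how
        # much of a target workflow the agent completed.
        return min(len(history) / (2 * required_steps), 1.0)

    corpus = []
    for _ in range(20):
        history = []
        for _ in range(random.randint(1, 4)):         # a few exchanges
            history.append(llm("client", history))
            history.append(llm("agent", history))
        if dialogue_success(history) >= 0.66:   # keep mostly-successful talks
            corpus.append(history)

    print(len(corpus), "dialogues kept for fine-tuning")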

--------------------------------------------------------------------------------------------------------

Inferring Intentions to Speak Using Accelerometer Data In-the-Wild

The paper by Li et al. studies inferring intentions to speak from wearable sensors. This could improve group coordination for AI assistants guiding discussions.

Authors:  Litian Li, Jord Molhoek, Jing Zhou

Link:  https://arxiv.org/abs/2401.05849v1

Date: 2024-01-11

Summary:

Humans have a good natural intuition for recognizing when another person has something to say. It would be valuable if an AI could also recognize intentions to speak, especially in scenarios where the AI is guiding a group discussion. This work studies the inference of successful and unsuccessful intentions to speak from accelerometer data. This modality is chosen because it is privacy-preserving and feasible for in-the-wild settings, since the sensor can be placed in a smart badge. Data from a real-life social networking event is used to train a machine-learning model that aims to infer intentions to speak. A subset of unsuccessful intention-to-speak cases in the data is annotated. The model is trained on the successful intentions to speak and evaluated on both the successful and unsuccessful cases. In conclusion, there is useful information in accelerometer data, but not enough to reliably capture intentions to speak. For example, posture shifts are correlated with intentions to speak, but people also often shift posture without having an intention to speak, or have an intention to speak without shifting their posture. More modalities are likely needed to reliably infer intentions to speak.
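
A minimal sketch of this kind of pipeline, windowed statistics from a 3-axis accelerometer fed to a linear classifier, follows. The synthetic data, features, and labels are illustrative assumptions in place of the private social-event recordings.

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(0)

    def window_features(acc):
        # Simple per-window statistics over a 3-axis accelerometer
        # signal; posture shifts show up as variance/energy bursts.
        return np.concatenate([acc.mean(0), acc.std(0),
                               np.abs(np.diff(acc, axis=0)).mean(0)])

    # Synthetic stand-in for annotated badge data: label 1 means the
    # wearer intended to speak during this window.
    X, y = [], []
    for _ in range(200):
        label = int(rng.integers(0, 2))
        acc = rng.normal(0, 1.0 + 0.5 * label, size=(50, 3))
        X.append(window_features(acc))
        y.append(label)

    clf = LogisticRegression(max_iter=1000).fit(X[:150], y[:150])
    print("held-out accuracy:", clf.score(X[150:], y[150:]))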

--------------------------------------------------------------------------------------------------------

Between Lines of Code: Unraveling the Distinct Patterns of Machine and Human Programmers

The paper by Shi et al. identifies distinct patterns in machine-generated code to detect its provenance. Detecting autogenerated code could help ensure software integrity and security.

Authors:  Yuling Shi, Hongyu Zhang, Chengcheng Wan, Xiaodong Gu

Link:  https://arxiv.org/abs/2401.06461v1

Date: 2024-01-12

Summary:

Large language models have catalyzed an unprecedented wave in code generation. While achieving significant advances, they blur the distinctions between machine- and human-authored source code, causing integrity and authenticity issues of software artifacts. Previous methods such as DetectGPT have proven effective in discerning machine-generated texts, but they do not identify and harness the unique patterns of machine-generated code. Thus, their applicability falters when applied to code. In this paper, we carefully study the specific patterns that characterize machine- and human-authored code. Through a rigorous analysis of code attributes such as length, lexical diversity, and naturalness, we expose unique patterns inherent to each source. We particularly notice that the structural segmentation of code is a critical factor in identifying its provenance. Based on our findings, we propose a novel machine-generated code detection method called DetectCodeGPT, which improves DetectGPT by capturing the distinct structural patterns of code. Diverging from conventional techniques that depend on external LLMs for perturbations, DetectCodeGPT perturbs the code corpus by strategically inserting spaces and newlines, ensuring both efficacy and efficiency. Experimental results show that our approach significantly outperforms state-of-the-art techniques in detecting machine-generated code.
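
The perturbation trick is simple to sketch: rather than paraphrasing code with an external LLM, randomly duplicate spaces and inject blank lines; a detector then compares the scoring model's log-probability of the original against the perturbed copies, as in DetectGPT. The scoring step is omitted here, and the insertion probabilities are illustrative assumptions.

    import random

    random.seed(0)

    def perturb_code(code, p_space=0.1, p_newline=0.05):
        # DetectCodeGPT-style perturbation: randomly insert extra
        # spaces and blank lines, disturbing the structural layout
        # of the code without calling an external LLM.
        out = []
        for line in code.splitlines():
            chars = []
            for ch in line:
                chars.append(ch)
                if ch == " " and random.random() < p_space:
                    chars.append(" ")           # duplicated space
            out.append("".join(chars))
            if random.random() < p_newline:
                out.append("")                  # stray blank line
        return "\n".join(out)

    snippet = "def add(a, b):\n    return a + b"
    for _ in range(2):
        print(repr(perturb_code(snippet)))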

--------------------------------------------------------------------------------------------------------

MISS: A Generative Pretraining and Finetuning Approach for Med-VQA

The paper by Chen et al. treats medical visual QA as a generative task using self-supervised pretraining. Generative abilities could improve clinical applicability over classification models.

Authors:  Jiawei Chen, Dingkang Yang, Yue Jiang, Yuxuan Lei, Lihua Zhang

Link:  https://arxiv.org/abs/2401.05163v1

Date: 2024-01-10

Summary:

Medical visual question answering (VQA) is a challenging multimodal task, where Vision-Language Pre-training (VLP) models can effectively improve generalization performance. However, most methods in the medical field treat VQA as an answer classification task, which is difficult to transfer to practical application scenarios. Additionally, due to the privacy of medical images and the expensive annotation process, large-scale medical image-text pair datasets for pretraining are severely lacking. In this paper, we propose a large-scale MultI-task Self-Supervised learning based framework (MISS) for medical VQA tasks. Unlike existing methods, we treat medical VQA as a generative task. We unify the text encoder and multimodal encoder and align image-text features through multi-task learning. Furthermore, we propose a Transfer-and-Caption method that extends the feature space of single-modal image datasets using large language models (LLMs), enabling traditional medical vision-task data to be applied to VLP. Experiments show that our method achieves excellent results with fewer multimodal datasets and demonstrates the advantages of generative VQA models. The code and model weights will be released upon the paper's acceptance.

--------------------------------------------------------------------------------------------------------

Use of Graph Neural Networks in Aiding Defensive Cyber Operations

The paper by Mitra et al. reviews graph neural networks for enhancing cybersecurity defenses. GNNs show promise for integrating heterogeneous threat data to thwart attacks.

Authors:  Shaswata Mitra, Trisha Chakraborty, Subash Neupane, Aritran Piplai, Sudip Mittal

Link:  https://arxiv.org/abs/2401.05680v1

Date: 2024-01-11

Summary:

In an increasingly interconnected world, where information is the lifeblood of modern society, regular cyber-attacks sabotage the confidentiality, integrity, and availability of digital systems and information. Additionally, cyber-attacks differ depending on the objective and evolve rapidly to evade defensive systems. However, a typical cyber-attack demonstrates a series of stages from attack initiation to final resolution, called an attack life cycle. These diverse characteristics and the relentless evolution of cyber attacks have led cyber defense to adopt modern approaches like Machine Learning to bolster defensive measures and break the attack life cycle. Among the adopted ML approaches, Graph Neural Networks have emerged as a promising approach for enhancing the effectiveness of defensive measures due to their ability to process and learn from heterogeneous cyber threat data. In this paper, we look into the application of GNNs in breaking each stage of one of the most renowned attack life cycles, the Lockheed Martin Cyber Kill Chain. We address each phase of the CKC and discuss how GNNs contribute to preparing for and preventing an attack from a defensive standpoint. Furthermore, we discuss open research areas and avenues for further improvement.

--------------------------------------------------------------------------------------------------------

Domain Adaptation for Time series Transformers using One-step fine-tuning

The paper by Khanal et al. proposes techniques to adapt time series transformers to new domains with limited data. Improved out-of-domain generalization could benefit many real-world applications.

Authors:  Subina Khanal, Seshu Tirupathi, Giulio Zizzo, Ambrish Rawat, Torben Bach Pedersen

Link:  https://arxiv.org/abs/2401.06524v1

Date: 2024-01-12

Summary:

The recent breakthrough of Transformers in deep learning has drawn significant attention from the time series community due to their ability to capture long-range dependencies. However, like other deep learning models, Transformers face limitations in time series prediction, including insufficient temporal understanding, generalization challenges, and data shift issues for domains with limited data. Additionally, addressing the issue of catastrophic forgetting, where models forget previously learned information when exposed to new data, is another critical aspect that requires attention in enhancing the robustness of Transformers for time series tasks. To address these limitations, in this paper, we pre-train the time series Transformer model on a source domain with sufficient data and fine-tune it on the target domain with limited data. We introduce the "One-step fine-tuning" approach, adding some percentage of source domain data to the target domains, providing the model with diverse time series instances. We then fine-tune the pre-trained model using a gradual unfreezing technique. This helps enhance the model's performance in time series prediction for domains with limited data. Extensive experimental results on two real-world datasets show that our approach improves over the state-of-the-art baselines by 4.35% and 11.54% for indoor temperature and wind power prediction, respectively.
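
Gradual unfreezing is straightforward to sketch in PyTorch: fine-tune with only the top layers trainable, then progressively expose lower ones. The tiny network, random data, and three-stage schedule below are placeholders for the paper's time series Transformer and its mixed source/target fine-tuning set.

    import torch
    import torch.nn as nn

    # A small stand-in for a pre-trained time series model.
    model = nn.Sequential(
        nn.Linear(24, 64), nn.ReLU(),    # "lower" layers
        nn.Linear(64, 64), nn.ReLU(),    # "middle" layers
        nn.Linear(64, 1),                # task head
    )

    # Mixed fine-tuning set: target-domain windows plus some percentage
    # of source-domain windows (the one-step idea), here just random.
    x = torch.randn(128, 24)
    y = torch.randn(128, 1)

    stages = [[4], [2, 4], [0, 2, 4]]    # unfreeze top -> bottom
    loss_fn = nn.MSELoss()
    for stage in stages:
        for p in model.parameters():
            p.requires_grad = False
        for i in stage:                  # gradually expose more layers
            for p in model[i].parameters():
                p.requires_grad = True
        opt = torch.optim.Adam(
            (p for p in model.parameters() if p.requires_grad), lr=1e-3)
        for _ in range(20):
            opt.zero_grad()
            loss = loss_fn(model(x), y)
            loss.backward()
            opt.step()
        print(f"stage {stage}: loss {loss.item():.4f}")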

--------------------------------------------------------------------------------------------------------

Boosting Mixed-Initiative Co-Creativity in Game Design: A Tutorial

The tutorial paper by Margarido et al. provides guidelines for developing video game design tools that enhance human-AI co-creativity. As AI capabilities advance, identifying techniques to boost collaborative mixed-initiative creativity will further propel innovation in interactive entertainment.

Authors:  Solange Margarido, Licínio Roque, Penousal Machado, Pedro Martins

Link:  https://arxiv.org/abs/2401.05999v1

Date: 2024-01-11

Summary:

In recent years, there has been a growing application of mixed-initiative co-creative approaches in the creation of video games. The rapid advances in the capabilities of artificial intelligence (AI) systems further propel creative collaboration between humans and computational agents. In this tutorial, we present guidelines for researchers and practitioners to develop game design tools with a high degree of mixed-initiative co-creativity (MI-CCy). We begin by reviewing a selection of current works that will serve as case studies and categorize them by the type of game content they address. We introduce the MI-CCy Quantifier, a framework that can be used by researchers and developers to assess co-creative tools on their level of MI-CCy through a visual scheme of quantifiable criteria scales. We demonstrate the usage of the MI-CCy Quantifier by applying it to the selected works. This analysis enabled us to discern prevalent patterns within these tools, as well as features that contribute to a higher level of MI-CCy. We highlight current gaps in MI-CCy approaches within game design, which we propose as pivotal aspects to tackle in the development of forthcoming approaches.

--------------------------------------------------------------------------------------------------------

Knowledge-Informed Machine Learning for Cancer Diagnosis and Prognosis: A review

The review paper by Mao et al. discusses integrating biomedical knowledge into machine learning models for cancer diagnosis and prognosis. Combining knowledge and data addresses challenges like limited samples and interpretability, and could ultimately improve model accuracy and clinical utility for this complex disease. Knowledge-informed techniques will be key to translating AI advancements into patient impact.

Authors:  Lingchao Mao, Hairong Wang, Leland S. Hu, Nhan L Tran, Peter D Canoll, Kristin R Swanson, Jing Li

Link:  https://arxiv.org/abs/2401.06406v1

Date: 2024-01-12

Summary:

Cancer remains one of the most challenging diseases to treat in the medical field. Machine learning has enabled in-depth analysis of rich multi-omics profiles and medical imaging for cancer diagnosis and prognosis. Despite these advancements, machine learning models face challenges stemming from limited labeled sample sizes, the intricate interplay of high-dimensional data types, the inherent heterogeneity observed among patients and within tumors, and concerns about interpretability and consistency with existing biomedical knowledge. One approach to surmount these challenges is to integrate biomedical knowledge into data-driven models, which has proven potential to improve the accuracy, robustness, and interpretability of model results. Here, we review the state-of-the-art machine learning studies that adopted the fusion of biomedical knowledge and data, termed knowledge-informed machine learning, for cancer diagnosis and prognosis. Emphasizing the properties inherent in four primary data types including clinical, imaging, molecular, and treatment data, we highlight modeling considerations relevant to these contexts. We provide an overview of diverse forms of knowledge representation and current strategies of knowledge integration into machine learning pipelines with concrete examples. We conclude the review article by discussing future directions to advance cancer research through knowledge-informed machine learning.

--------------------------------------------------------------------------------------------------------


EYE ON A.I. GETS READERS UP TO DATE ON THE LATEST FUNDING NEWS AND RELATED ISSUES. SUBSCRIBE FOR THE WEEKLY NEWSLETTER.