ACL2024: RIW: Subtle Signatures, Strong Shields: Advancing Robust and Imperceptible Watermarking in Large Language Models

Yubing Ren, Ping Guo, Yanan Cao, Wei Ma

Abstract: The widespread adoption of Large Language Models (LLMs) has led to an increase in AI-generated text on the Internet, presenting the crucial challenge of differentiating AI-created content from human-written text. Meeting this challenge is critical to prevent issues of authenticity, trust, and potential copyright violations. Current research focuses on watermarking LLM-generated text, but traditional techniques struggle to balance robustness with text quality. We introduce a novel watermarking approach, Robust and Imperceptible Watermarking (RIW) for LLMs, which leverages token prior probabilities to improve detectability while maintaining watermark imperceptibility. RIW methodically embeds watermarks by partitioning selected tokens into two distinct groups based on their prior probabilities and employing tailored strategies for each group. In the detection stage, RIW employs a 'voted z-test' to provide a statistically robust framework for accurately identifying the presence of a watermark. The effectiveness of RIW is evaluated across three key dimensions: success rate, text quality, and robustness against removal attacks. Our experimental results on various LLMs, including GPT2-XL, OPT-1.3B, and LLaMA2-7B, indicate that RIW surpasses existing methods while exhibiting increased robustness against various attacks and good imperceptibility, thus promoting the responsible use of LLMs.
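
The detection side of such schemes can be illustrated with a standard one-proportion z-test on green-token counts. Below is a minimal sketch, assuming a green-list fraction gamma; the majority vote across token groups is a simplified stand-in for RIW's actual voted z-test, which the paper defines precisely.

# A minimal sketch of watermark detection via a one-proportion z-test.
# The "voted" aggregation is a simplified stand-in, not RIW's exact method.
import math

def z_score(green_hits: int, total: int, gamma: float = 0.5) -> float:
    """z-statistic for observing `green_hits` green-listed tokens out of
    `total`, when unwatermarked text hits the green list w.p. `gamma`."""
    expected = gamma * total
    std = math.sqrt(total * gamma * (1.0 - gamma))
    return (green_hits - expected) / std

def voted_detect(group_counts, gamma=0.5, z_threshold=4.0):
    """Run a z-test per token group; declare a watermark if the majority
    of groups exceed the threshold (simplified voting)."""
    votes = [z_score(hits, n, gamma) > z_threshold for hits, n in group_counts]
    return sum(votes) > len(votes) / 2

# Example: two token groups, each given as (green hits, tokens inspected).
print(voted_detect([(140, 200), (120, 150)]))  # -> True for strongly biased counts

With genuinely watermarked text the per-group z-scores far exceed any reasonable threshold, so a vote of this kind stays reliable even if an attack corrupts a few groups.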

https://github.com/Lilice-r/RIW
NAACL2024: TISE: A Tripartite In-context Selection Method for Event Argument Extraction

Yanhe Fu, Yanan Cao, Qingyue Wang, and Yi Liu

Abstract: In-context learning enhances the reasoning capabilities of LLMs by providing several examples. A direct yet effective approach to obtaining in-context examples is to select the top-k examples based on their semantic similarity to the test input. However, when applied to event argument extraction (EAE), this approach exhibits two shortcomings: 1) it may select almost identical examples, thus failing to provide additional event information, and 2) it overlooks event attributes, leading to selected examples that are unrelated to the test event type. In this paper, we introduce three necessary requirements when selecting an in-context example for the EAE task: semantic similarity, example diversity, and event correlation. We further propose TISE, which scores examples from these three perspectives and integrates them using Determinantal Point Processes to directly select a set of examples as context. Experimental results on the ACE05 dataset demonstrate the effectiveness of TISE and the necessity of all three requirements. Furthermore, we surprisingly observe that TISE can achieve superior performance with fewer examples and can even exceed some supervised methods.
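
The integration step can be pictured as greedy MAP inference over a DPP kernel. The sketch below is a toy illustration, assuming the three per-example scores (semantic similarity, diversity, event correlation) have already been collapsed into a single quality vector; TISE's real scoring functions are defined in the paper.

# A minimal sketch of selecting in-context examples with a Determinantal
# Point Process. Quality and similarity values are toy placeholders.
import numpy as np

def greedy_dpp(quality, sim, k):
    """Greedily pick k items maximizing the DPP log-determinant under the
    kernel L = diag(q) @ sim @ diag(q)."""
    L = np.diag(quality) @ sim @ np.diag(quality)
    selected = []
    for _ in range(k):
        best, best_gain = None, -np.inf
        for i in range(len(quality)):
            if i in selected:
                continue
            idx = selected + [i]
            sign, logdet = np.linalg.slogdet(L[np.ix_(idx, idx)])
            # skip candidates whose submatrix is not positive definite
            if sign > 0 and logdet > best_gain:
                best, best_gain = i, logdet
        selected.append(best)
    return selected

rng = np.random.default_rng(0)
emb = rng.normal(size=(8, 16))
emb /= np.linalg.norm(emb, axis=1, keepdims=True)
sim = emb @ emb.T                      # pairwise cosine similarity
quality = rng.uniform(0.5, 1.0, 8)     # stand-in for TISE's combined score
print(greedy_dpp(quality, sim, k=3))   # indices of a diverse, high-quality set

The determinant trades off quality against redundancy, which is exactly why near-duplicate examples, the first shortcoming above, get suppressed.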

COLING2024: DEIE: Benchmarking Document-level Event Information Extraction with a Large-scale Chinese News Dataset

Yubing Ren, Yanan Cao, Hao Li, Yingjie Li, Zixuan Ma, Fang Fang, Ping Guo and Wei Ma

Abstract: A text corpus centered on events is foundational to research concerning the detection, representation, reasoning, and harnessing of online events. The majority of current event-based datasets mainly target sentence-level tasks; thus, to advance event-related research from the sentence to the document level, this paper introduces DEIE, a unified large-scale document-level event information extraction dataset with over 56,000 events and 242,000 arguments. Three key features stand out: large-scale manual annotation (20,000 documents), comprehensive unified annotation (encompassing event trigger/argument, summary, and relation at once), and emergency event annotation (covering 19 emergency types). Notably, our experiments reveal that current event-related models struggle with DEIE, signaling a pressing need for more advanced event-related research in the future.
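
For readers wondering what "unified annotation" looks like concretely, here is a hypothetical record layout; all field names are illustrative assumptions, and the authoritative format is whatever the repository releases.

# A hypothetical record layout illustrating DEIE's unified annotation
# (trigger/argument, summary, and relation per document). Field names are
# assumptions for illustration only.
from dataclasses import dataclass, field

@dataclass
class Argument:
    role: str           # e.g., "victim", "place"
    span: tuple         # (start, end) character offsets in the document
    text: str

@dataclass
class Event:
    event_type: str     # one of the 19 emergency types
    trigger: tuple      # (start, end) offsets of the trigger word
    arguments: list = field(default_factory=list)   # Argument objects

@dataclass
class Document:
    doc_id: str
    text: str
    events: list = field(default_factory=list)      # Event objects
    summary: str = ""                                # event-centric summary
    relations: list = field(default_factory=list)    # (event_i, event_j, type)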

https://github.com/Lilice-r/DEIE
ICASSP2024: Sorting, Reasoning, and Extraction: An Easy-to-Hard Reasoning Framework for Document-level Event Argument Extraction

Hao Li, Yanan Cao, Yubing Ren, Fang Fang, Lanxue Zhang, Yingjie Li, Shi Wang

Abstract: Document-level event argument extraction is a crucial task for understanding event information. Existing methods mostly ignore the differing extraction difficulties of arguments, and the lack of task planning significantly limits the extraction and reasoning abilities of the model. In this paper, we analyze the difficulty of arguments and propose a novel framework for reasoning from easy to hard, aiming to use the information from simple arguments to help extract difficult arguments in a human-like way. Specifically, our framework consists of three core modules: sorting, reasoning, and extraction. The sorting module first sorts the argument roles according to the current context and plans a reasoning path from easy to hard. Then, the reasoning module performs information reasoning along this path to help capture the information of difficult arguments. Finally, the extraction module utilizes the reasoning information to complete argument extraction. Experimental results on the RAMS and WikiEvents datasets show the great advantages of our proposed approach. In particular, we obtain new state-of-the-art (SOTA) performance in multiple scenarios.
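
The control flow of the framework can be sketched in a few lines. In the sketch below, `difficulty` and `extract` are toy placeholders for the paper's learned sorting and extraction modules, not the actual components.

# A minimal sketch of the easy-to-hard idea: order argument roles by an
# estimated difficulty, then extract them in that order so earlier (easy)
# answers can condition later (hard) ones.
def easy_to_hard_extract(document, roles, difficulty, extract):
    ordered = sorted(roles, key=lambda r: difficulty(document, r))
    answers = {}
    for role in ordered:
        # the running `answers` dict plays the part of reasoning context
        answers[role] = extract(document, role, context=answers)
    return answers

# Toy usage: pretend shorter role names are "easier".
doc = "A bomb exploded near the embassy on Tuesday."
roles = ["place", "time", "instrument"]
result = easy_to_hard_extract(
    doc, roles,
    difficulty=lambda d, r: len(r),
    extract=lambda d, r, context: f"<{r} given {sorted(context)}>",
)
print(result)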

https://github.com/hlee-top/event_extract
EMNLP2023: Intra-Event and Inter-Event Dependency-Aware Graph Network for Event Argument Extraction

Hao Li, Yanan Cao, Yubing Ren, Fang Fang, Lanxue Zhang, Yingjie Li, Shi Wang

Abstract: Event argument extraction is critical to various natural language processing tasks for providing structured information. Existing works usually extract event arguments one by one and mostly neglect to model dependencies among event argument roles, especially from the perspective of event structure. Such an approach hinders the model from learning the interactions between different roles. In this paper, we raise our research question: how can we adequately model dependencies between different roles for better performance? To this end, we propose an intra-event and inter-event dependency-aware graph network, which uses the event structure as the fundamental unit for constructing dependencies between roles. Specifically, we first utilize a dense intra-event graph to construct role dependencies within events, and then construct dependencies between events by retrieving events similar to the current one through a retrieval module. To further optimize dependency information and event representation, we propose a dependency interaction module and two auxiliary tasks to improve the extraction ability of the model in different scenarios. Experimental results on the ACE05, RAMS, and WikiEvents datasets show the great advantages of our proposed approach.
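
One way to picture the graph construction: a dense subgraph over the current event's roles, a dense subgraph over a retrieved similar event's roles, and cross-event links between them. The shared-role linking rule below is an assumption made for illustration; the paper's retrieval module and dependency interaction module are richer than this sketch.

# A minimal sketch of the dependency graph: dense intra-event connections
# plus inter-event edges to the roles of a retrieved similar event.
import numpy as np

def build_adjacency(event_roles, retrieved_roles):
    """Nodes = roles of the current event followed by roles of a retrieved
    similar event; returns a 0/1 adjacency matrix."""
    n, m = len(event_roles), len(retrieved_roles)
    A = np.zeros((n + m, n + m), dtype=int)
    A[:n, :n] = 1            # dense intra-event connections
    A[n:, n:] = 1            # dense connections within the retrieved event
    for i, r in enumerate(event_roles):
        for j, s in enumerate(retrieved_roles):
            if r == s:       # link shared roles across events (assumption)
                A[i, n + j] = A[n + j, i] = 1
    np.fill_diagonal(A, 0)
    return A

print(build_adjacency(["attacker", "target", "place"], ["attacker", "place"]))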

NLPCC2023: A Multi-granularity Similarity Enhanced Model for Implicit Event Argument Extraction

Yanhe Fu, Yi Liu, Yanan Cao, Yubing Ren, Qingyue Wang, Fang Fang, Cong Cao

Abstract: Implicit Event Argument Extraction (Implicit EAE) aims to extract document-level event arguments given the event type. Because documents are long, arguments scattered across different sentences pose two challenges during extraction: long-range dependency and distracting context. Existing works rely on the contextual capabilities of pre-trained models and on semantic features, but they lack a straightforward solution for these two challenges and may introduce noise. In this paper, we propose a Multi-granularity Similarity Enhanced Model to address these issues. Specifically, we first construct a heterogeneous graph to incorporate global information, then design a supplementary task to tackle the above challenges. For long-range dependency, span-level enhancement directly closes the semantic distance between triggers and arguments across sentences; for distracting context, sentence-level enhancement makes the model concentrate more on effective content. Experimental results on RAMS and WikiEvents demonstrate that our proposed model obtains state-of-the-art performance on Implicit EAE.
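
The two enhancement granularities can be read as auxiliary similarity objectives. The sketch below is an approximation, assuming plain cosine similarity over placeholder embeddings; the paper's heterogeneous graph and exact losses are not reproduced here.

# A minimal sketch of the two enhancement signals: pull trigger and
# gold-argument span embeddings together (span level) and upweight
# sentences similar to the trigger sentence (sentence level).
import numpy as np

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def span_level_loss(trigger_vec, arg_vecs):
    # encourage high trigger-argument similarity across sentences
    return sum(1.0 - cosine(trigger_vec, a) for a in arg_vecs) / len(arg_vecs)

def sentence_level_weights(trigger_sent_vec, sent_vecs):
    # focus the model on sentences that resemble the trigger sentence
    scores = np.array([cosine(trigger_sent_vec, s) for s in sent_vecs])
    e = np.exp(scores - scores.max())
    return e / e.sum()

rng = np.random.default_rng(0)
print(span_level_loss(rng.normal(size=8), [rng.normal(size=8) for _ in range(3)]))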

ACL2023: Towards Better Entity Linking with Multi-View Enhanced Distillation

Yi Liu, Yuan Tian, Jianxun Lian, Xinlong Wang, Yanan Cao, Fang Fang, Wen Zhang, Haizhen Huang, Denvy Deng, Qi Zhang

Abstract: Dense retrieval is widely used in entity linking to retrieve entities from large-scale knowledge bases. Mainstream techniques are based on a dual-encoder framework, which encodes mentions and entities independently and calculates their relevance via coarse interaction metrics, making it difficult to explicitly model the multiple mention-relevant parts within entities that match divergent mentions. Aiming at learning entity representations that can match divergent mentions, this paper proposes a Multi-View Enhanced Distillation (MVD) framework, which can effectively transfer knowledge of multiple fine-grained and mention-relevant parts within entities from cross-encoders to dual-encoders. Each entity is split into multiple views to avoid irrelevant information being over-squashed into the mention-relevant view. We further design cross-alignment and self-alignment mechanisms for this framework to facilitate fine-grained knowledge distillation from the teacher model to the student model. Meanwhile, we reserve a global view that embeds the entity as a whole to prevent the dispersal of uniform information. Experiments show our method achieves state-of-the-art performance on several entity linking benchmarks.
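
The view-splitting idea is easy to sketch: score a mention against each local view and a global view, and let the best-matching view drive retrieval. The hash-seeded encoder below is a stand-in, not a real model, and the distillation objectives are omitted entirely.

# A minimal sketch of the multi-view idea: sentence-level local views plus
# one global view per entity, scored against the mention with max-pooling.
import numpy as np

def encode(text, dim=32):
    # stand-in encoder: hash-seeded random unit vector (NOT a real model)
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    v = rng.normal(size=dim)
    return v / np.linalg.norm(v)

def multi_view_score(mention, entity_text):
    views = [s for s in entity_text.split(". ") if s]    # local views
    views.append(entity_text)                             # global view
    m = encode(mention)
    return max(float(m @ encode(v)) for v in views)       # best view wins

print(multi_view_score("Apple the company",
                       "Apple Inc. is a tech firm. The apple is a fruit."))

Max-pooling over views is what keeps a long entity description from "over-squashing" the one part relevant to a given mention.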

https://github.com/Noen61/MVD
ACL2023: Retrieve-and-Sample: Document-level Event Argument Extraction via Hybrid Retrieval Augmentation

Yubing Ren, Yanan Cao, Ping Guo, Fang Fang, Wei Ma, Zheng Lin

Abstract: Recent studies have shown the effectiveness of retrieval augmentation in many generative NLP tasks. These retrieval-augmented methods allow models to explicitly acquire prior external knowledge in a non-parametric manner and regard the retrieved reference instances as cues to augment text generation. These methods use similarity-based retrieval, which rests on a simple hypothesis: the more the retrieved demonstration resembles the original input, the more likely the demonstration label resembles the input label. However, due to the complexity of event labels and the sparsity of event arguments, this hypothesis does not always hold in document-level event argument extraction (EAE). This raises an interesting question: how should we design the retrieval strategy for document-level EAE? In this paper, we investigate various retrieval settings from both the input and label distribution views. We further augment document-level EAE with pseudo demonstrations sampled from event semantic regions that cover adequate alternatives within the same context and event schema. Through extensive experiments on RAMS and WikiEvents, we demonstrate the validity of our newly introduced retrieval-augmented methods and analyze why they work.
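
The contrast between similarity-based retrieval and region-based sampling can be sketched as follows. Approximating the "event semantic region" as a Gaussian fit to embeddings of demonstrations sharing the test instance's event schema is our assumption for illustration, not the paper's exact sampling procedure.

# A minimal sketch contrasting top-1 similarity retrieval with sampling a
# pseudo demonstration from an approximate event semantic region.
import numpy as np

def retrieve_top1(query_vec, demo_vecs):
    sims = demo_vecs @ query_vec
    return int(np.argmax(sims))

def sample_pseudo_demo(region_vecs, rng):
    mu = region_vecs.mean(axis=0)
    sigma = region_vecs.std(axis=0) + 1e-6
    return rng.normal(mu, sigma)          # a point inside the region

rng = np.random.default_rng(0)
demos = rng.normal(size=(10, 8))          # toy demonstration embeddings
query = rng.normal(size=8)
print(retrieve_top1(query, demos))
print(sample_pseudo_demo(demos[:4], rng).shape)  # region = first 4 demos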

https://github.com/Lilice-r/RAG_DEE
COLING2022: CLIO: Role-interactive Multi-event Head Attention Network for Document-level Event Extraction

Yubing Ren, Yanan Cao, Fang Fang, Ping Guo, Zheng Lin, Wei Ma, Yi Liu

Abstract: Transforming the large amounts of unstructured text on the Internet into structured event knowledge is a critical, yet unsolved goal of NLP, especially when addressing document-level text. Existing methods struggle with Document-level Event Extraction (DEE) due to its two intrinsic challenges: (a) nested arguments, where one argument is a sub-string of another, and (b) multiple events, where we must identify several events and assemble the arguments for each. In this paper, we propose a role-interactive multi-event head attention network (CLIO) to solve these two challenges jointly. The key idea is to map different events to multiple subspaces (i.e., multi-event heads). In each event subspace, we draw the semantic representation of each role closer to its corresponding arguments and then determine whether the current event exists. To further optimize event representation, we propose an event representation enhancing strategy that regularizes the pre-trained embedding space to be more isotropic. Our experiments on two widely used DEE datasets show that CLIO achieves consistent improvements over previous methods.
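
The "multi-event head" can be read as multi-head attention where each head is an event subspace in which role queries attend over token representations. The sketch below follows the abstract's description loosely; the role-argument pulling objective and the event-existence classifier are omitted.

# A minimal sketch of multi-event head attention: per head (event subspace),
# role queries attend over tokens; the pooled result represents each role.
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def multi_event_head_attention(tokens, role_queries, n_heads):
    """tokens: (T, d), role_queries: (R, d); returns (n_heads, R, d_head)."""
    T, d = tokens.shape
    d_head = d // n_heads
    tok = tokens.reshape(T, n_heads, d_head).transpose(1, 0, 2)        # (H, T, dh)
    qry = role_queries.reshape(-1, n_heads, d_head).transpose(1, 0, 2)  # (H, R, dh)
    attn = softmax(qry @ tok.transpose(0, 2, 1) / np.sqrt(d_head))      # (H, R, T)
    return attn @ tok                                                    # (H, R, dh)

rng = np.random.default_rng(0)
out = multi_event_head_attention(rng.normal(size=(12, 16)),
                                 rng.normal(size=(4, 16)), n_heads=4)
print(out.shape)  # (4, 4, 4): one role representation per event head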

https://github.com/Lilice-r/CLIO
EMNLP2021: TEBNER: Domain Specific Named Entity Recognition with Type Expanded Boundary-aware Network

Zheng Fang, Yanan Cao, Tai Li, Ruipeng Jia, Fang Fang, Yanmin Shang, Yuhai Lu

Abstract: To alleviate label scarcity in the Named Entity Recognition (NER) task, distantly supervised NER methods are widely applied to automatically label data and identify entities. Although human effort is reduced, the generated incomplete and noisy annotations pose new challenges for learning effective neural models. In this paper, we propose a novel dictionary extension method which extracts new entities through a type-expanded model. Moreover, we design a multi-granularity boundary-aware network which detects entity boundaries from both local and global perspectives. We conduct experiments on different types of datasets; the results show that our model outperforms previous state-of-the-art distantly supervised systems and even surpasses supervised models.
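
The distant-supervision starting point is worth making concrete: dictionary matching produces exactly the incomplete, noisy BIO labels that the type-expansion and boundary-aware modules then have to correct. The matcher below is intentionally naive.

# A minimal sketch of distant labeling: greedy longest-match against a
# {phrase: type} dictionary; unmatched tokens stay 'O'. Returns BIO tags.
def distant_label(tokens, dictionary):
    tags = ["O"] * len(tokens)
    i = 0
    while i < len(tokens):
        for j in range(len(tokens), i, -1):          # longest match first
            phrase = " ".join(tokens[i:j])
            if phrase in dictionary:
                etype = dictionary[phrase]
                tags[i] = f"B-{etype}"
                for k in range(i + 1, j):
                    tags[k] = f"I-{etype}"
                i = j
                break
        else:
            i += 1
    return tags

print(distant_label("aspirin relieves chest pain".split(),
                    {"aspirin": "DRUG", "chest pain": "SYMPTOM"}))
# -> ['B-DRUG', 'O', 'B-SYMPTOM', 'I-SYMPTOM']

Any entity missing from the dictionary is silently labeled 'O', which is the incompleteness problem the paper targets.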

https://github.com/fangzheng123/TEBNER
WWW2020: High Quality Candidate Generation and Sequential Graph Attention Network for Entity Linking

Zheng Fang, Yanan Cao, Ren Li, Zhenyu Zhang, Yanbing Liu, Shi Wang

Abstract: Entity Linking (EL) is the task of mapping mentions in text to corresponding entities in a knowledge base (KB). This task usually includes candidate generation (CG) and entity disambiguation (ED) stages. Recent EL systems based on neural network models have achieved good performance, but they still face two challenges: (i) previous studies evaluate their models without considering the differences between candidate entities; in fact, the quality (gold recall in particular) of candidate sets affects the EL results, so how to improve the quality of candidates needs more attention. (ii) To utilize the topical coherence among the referred entities, many graph and sequence models have been proposed for collective ED. However, graph-based models treat all candidate entities equally, which may introduce noise, while sequence models can only observe previously referred entities, ignoring the relevance between the current mention and its subsequent entities. To address the first problem, we propose a multi-strategy CG method to generate high-recall candidate sets. For the second problem, we design a Sequential Graph Attention Network (SeqGAT) which combines the advantages of graph and sequence methods. In our model, mentions are processed sequentially. Given the current mention, SeqGAT dynamically encodes both its previously referred entities and subsequent ones, assigning different importance to these entities. In this way, it not only makes full use of topical consistency but also reduces noise interference. We conduct experiments on different types of datasets and compare our method with previous EL systems on an open evaluation platform. The comparison results show that our model achieves significant improvements over the state-of-the-art methods.
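
The multi-strategy CG stage can be sketched as a union of candidate sources ranked by a link prior. The three toy sources below mirror the idea at miniature scale; real systems build these tables from KB dumps, and the exact strategies are defined in the paper.

# A minimal sketch of multi-strategy candidate generation: union the
# candidates from exact title match, an alias table, and a link-prior
# table, then keep the top-K by prior probability.
def generate_candidates(mention, titles, aliases, priors, k=3):
    cands = set()
    if mention in titles:
        cands.add(mention)                            # strategy 1: exact match
    cands.update(aliases.get(mention.lower(), []))    # strategy 2: alias table
    cands.update(priors.get(mention.lower(), {}))     # strategy 3: link priors
    ranked = sorted(cands,
                    key=lambda e: priors.get(mention.lower(), {}).get(e, 0.0),
                    reverse=True)
    return ranked[:k]

titles = {"Michael Jordan"}
aliases = {"mj": ["Michael Jordan", "Michael B. Jordan"]}
priors = {"mj": {"Michael Jordan": 0.9, "Michael B. Jordan": 0.1}}
print(generate_candidates("MJ", titles, aliases, priors))

Unioning several sources is what pushes gold recall up; the prior-based cutoff keeps the candidate set small enough for the ED stage.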

https://github.com/fangzheng123/SGEL
WWW2019: Joint Entity Linking with Deep Reinforcement Learning

Zheng Fang, Yanan Cao, Qian Li, Dongjie Zhang, Zhenyu Zhang, Yanbing Liu

Abstract: Entity linking is the task of aligning mentions to corresponding entities in a given knowledge base. Previous studies have highlighted the necessity for entity linking systems to capture global coherence. However, there are two common weaknesses in previous global models. First, most of them calculate pairwise scores between all candidate entities and select the most relevant group of entities as the final result. In this process, the consistency among wrong entities as well as that among right ones is involved, which may introduce noisy data and increase model complexity. Second, the cues from previously disambiguated entities, which could contribute to the disambiguation of subsequent mentions, are usually ignored by previous models. To address these problems, we convert global linking into a sequence decision problem and propose a reinforcement learning model which makes decisions from a global perspective. Our model makes full use of previously referred entities and explores the long-term influence of the current selection on subsequent decisions. We conduct experiments on different types of datasets; the results show that our model outperforms state-of-the-art systems and has better generalization performance.
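
The sequence-decision view can be sketched as a greedy policy rollout in which each choice conditions on the entities already linked. The scoring function below is a toy placeholder for the paper's learned policy network, and the policy-gradient update that trains it is omitted.

# A minimal sketch of entity linking as a sequential decision process: a
# policy scores each mention's candidates given the entities chosen so far.
def link_sequentially(mentions, candidates, score):
    """candidates: {mention: [entity, ...]}; score(mention, entity, history)
    returns a float. Greedy rollout of the policy."""
    history = []
    for m in mentions:
        best = max(candidates[m], key=lambda e: score(m, e, history))
        history.append(best)
    return history

# Toy score: prefer candidates containing the mention string, plus token
# overlap with already-linked entities (a crude stand-in for coherence).
def toy_score(mention, entity, history):
    overlap = sum(len(set(entity.split()) & set(h.split())) for h in history)
    return overlap + (1.0 if mention in entity else 0.0)

print(link_sequentially(
    ["Jordan", "Bulls"],
    {"Jordan": ["Michael Jordan", "Jordan (country)"],
     "Bulls": ["Chicago Bulls", "Bull (animal)"]},
    toy_score,
))  # -> ['Michael Jordan', 'Chicago Bulls']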

https://github.com/fangzheng123/RLEL_2019