
Where possible, the bibliography entries follow common conventions for citation keys. Journal abbreviations follow the ISO 4 standard. Links to online versions of cited works use DOIs as persistent identifiers. Open access URLs are listed when available.

– A –


"SpanMarker for Named Entity Recognition"
Tom Aarsen
Radboud University (2023-06-01)

A span-level Named Entity Recognition (NER) model that aims to improve performance while reducing computational requirements. SpanMarker leverages special marker tokens and utilizes BERT-style encoders with position IDs and attention mask matrices to capture contextual information effectively.


"DBpedia: A Nucleus for a Web of Open Data"
Sören Auer, Christian Bizer, Georgi Kobilarov, Jens Lehmann, Richard Cyganiak, Zachary Ives
ISWC (2007-11-11)

DBpedia is a community effort to extract structured information from Wikipedia and to make this information available on the Web. DBpedia allows you to ask sophisticated queries against datasets derived from Wikipedia and to link other datasets on the Web to Wikipedia data.

– B –


"Hinge-Loss Markov Random Fields and Probabilistic Soft Logic"
Stephen Bach, Matthias Broecheler, Bert Huang, Lise Getoor
JMLR (2017-11-17)

We introduce two new formalisms for modeling structured data, and show that they can both capture rich structure and scale to big data. The first, hinge-loss Markov random fields (HL-MRFs), is a new kind of probabilistic graphical model that generalizes different approaches to convex inference.


"Entities, Labels, and Surface Forms"
Caroline Barrière
Springer (2016-11-19)

We will look into a first obstacle toward this seemingly simple IE goal: the fact that entities do not have normalized names. Instead, entities can be referred to by many different surface forms.


"Topologies of Reasoning: Demystifying Chains, Trees, and Graphs of Thoughts"
Maciej Besta, Florim Memedi, Zhenyu Zhang, Robert Gerstenberger, Nils Blach, Piotr Nyczyk, Marcin Copik, Grzegorz Kwasniewski, Jürgen Müller, Lukas Gianinazzi, Ales Kubicek, Hubert Niewiadomski, Onur Mutlu, Torsten Hoefler
ETH Zurich (2024-01-25)

Introducing a blueprint and an accompanying taxonomy of prompting schemes, focusing on the underlying structure of reasoning.

– C –


"REDFM: a Filtered and Multilingual Relation Extraction Dataset"
Pere-Lluís Huguet Cabot, Simone Tedeschi, Axel-Cyrille Ngonga Ngomo, Roberto Navigli
ACL (2023-06-19)

Relation Extraction (RE) is a task that identifies relationships between entities in a text, enabling the acquisition of relational facts and bridging the gap between natural language and structured knowledge. However, current RE models often rely on small datasets with low coverage of relation types, particularly when working with languages other than English. In this paper, we address the above issue and provide two new resources that enable the training and evaluation of multilingual RE systems.

– E –


"Introducing Wikidata to the Linked Data Web"
Fredo Erxleben, Michael Günther, Markus Krötzsch, Julian Mendez, Denny Vrandečić
ISWC (2014-10-19)

We introduce new RDF exports that connect Wikidata to the Linked Data Web. We explain the data model of Wikidata and discuss its encoding in RDF. Moreover, we introduce several partial exports that provide more selective or simplified views on the data.

– F –


"KÙZU Graph Database Management System"
Xiyang Feng, Guodong Jin, Ziyi Chen, Chang Liu, Semih Salihoğlu
CIDR (2023-01-08)

We present Kùzu, a new GDBMS we are developing at the University of Waterloo that aims to integrate state-of-the-art storage, indexing, and query processing techniques to highly optimize for this feature set.

– G –


"Towards Foundation Models for Knowledge Graph Reasoning"
Mikhail Galkin, Xinyu Yuan, Hesham Mostafa, Jian Tang, Zhaocheng Zhu
preprint (2023-10-06)

ULTRA builds relational representations as a function conditioned on their interactions. Such a conditioning strategy allows a pre-trained ULTRA model to inductively generalize to any unseen KG with any relation vocabulary and to be fine-tuned on any graph.

– H –


"Exploring network structure, dynamics, and function using NetworkX"
Aric A. Hagberg, Daniel A. Schult, Pieter J. Swart
SciPy2008 (2008-08-19)

NetworkX is a Python language package for exploration and analysis of networks and network algorithms. The core package provides data structures for representing many types of networks, or graphs, including simple graphs, directed graphs, and graphs with parallel edges and self loops.
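Those core data structures can be exercised in a few lines. A minimal sketch (the graph contents are invented for illustration) showing the package's support for directed graphs, parallel edges, and self-loops:

```python
import networkx as nx

# MultiDiGraph allows both parallel edges and self-loops
g = nx.MultiDiGraph()
g.add_edge("a", "b")
g.add_edge("a", "b")   # parallel edge
g.add_edge("b", "b")   # self-loop
g.add_edge("b", "c")

print(g.number_of_edges())        # counts parallel edges separately
print(nx.number_of_selfloops(g))  # counts the b -> b loop
```

The `Graph`, `DiGraph`, `MultiGraph`, and `MultiDiGraph` classes share a common interface, so algorithms can often be applied across graph types without code changes.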


"Automatic generation of hypertext knowledge bases"
Udo Hahn, Ulrich Reimer
ACM SIGOIS 9:2 (1988-04-01)

The condensation process transforms the text representation structures resulting from the text parse into a more abstract thematic description of what the text is about, filtering out irrelevant knowledge structures and preserving only the most salient concepts.


Graph Representation Learning
William Hamilton
Morgan and Claypool (pre-print 2020)

A brief but comprehensive introduction to graph representation learning, including methods for embedding graph data, graph neural networks, and deep generative models of graphs.


"OpenNRE: An Open and Extensible Toolkit for Neural Relation Extraction"
Xu Han, Tianyu Gao, Yuan Yao, Deming Ye, Zhiyuan Liu, Maosong Sun
EMNLP (2019-11-03)

OpenNRE is an open-source and extensible toolkit that provides a unified framework to implement neural models for relation extraction (RE).


"Reconciliation of RDF* and Property Graphs"
Olaf Hartig
CoRR (2014-11-14)

The document proposes a formalization of the PG model and introduces well-defined transformations between PGs and RDF.


"spaCy: Industrial-strength Natural Language Processing in Python"
Matthew Honnibal, Ines Montani, Sofie Van Landeghem, Adriane Boyd
Explosion AI (2016-10-18)

spaCy is a library for advanced Natural Language Processing in Python and Cython. It's built on the very latest research, and was designed from day one to be used in real products.
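A minimal usage sketch, assuming only that spaCy itself is installed; it uses a blank English pipeline so no pretrained model download is required:

```python
import spacy

# Blank pipeline: tokenizer only, no statistical components
nlp = spacy.blank("en")
doc = nlp("spaCy parses industrial-strength text.")
print([token.text for token in doc])
```

Production pipelines would typically load a pretrained model (e.g. via `spacy.load`) to add tagging, parsing, and named entity recognition on top of tokenization.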

– L –


"InGram: Inductive Knowledge Graph Embedding via Relation Graphs"
Jaejun Lee, Chanyoung Chung, Joyce Jiyoung Whang
ICML (2023-08-17)

In this paper, we propose an INductive knowledge GRAph eMbedding method, InGram, that can generate embeddings of new relations as well as new entities at inference time.


"Barack's Wife Hillary: Using Knowledge-Graphs for Fact-Aware Language Modeling"
Robert L. Logan IV, Nelson F. Liu, Matthew E. Peters, Matt Gardner, Sameer Singh
ACL (2019-06-20)

We introduce the knowledge graph language model (KGLM), a neural language model with mechanisms for selecting and copying facts from a knowledge graph that are relevant to the context.

– M –


"Formalising openCypher Graph Queries in Relational Algebra"
József Marton, Gábor Szárnyas, Dániel Varró
ADBIS (2017-08-25)

We present a formal specification for openCypher, a high-level declarative graph query language with an ongoing standardisation effort.


"TextRank: Bringing Order into Text"
Rada Mihalcea, Paul Tarau
EMNLP pp. 404-411 (2004-07-25)

In this paper, the authors introduce TextRank, a graph-based ranking model for text processing, and show how this model can be successfully used in natural language applications.
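The core idea can be sketched with a word co-occurrence graph ranked by PageRank; this is a simplified illustration with an invented token list (full TextRank also filters candidates by part-of-speech and uses a sliding co-occurrence window):

```python
import networkx as nx

# Toy token sequence; edges link adjacent words (window of 2)
tokens = ["graph", "based", "ranking", "model", "graph", "ranking"]

g = nx.Graph()
for w1, w2 in zip(tokens, tokens[1:]):
    g.add_edge(w1, w2)

# PageRank over the co-occurrence graph scores word importance
scores = nx.pagerank(g)
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked)
```

Words that co-occur with many distinct neighbors accumulate higher scores, which is how TextRank surfaces key phrases without any training data.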

– N –


"PyTextRank, a Python implementation of TextRank for phrase extraction and summarization of text documents"
Paco Nathan, et al.
Derwen (2016-10-03)

A Python implementation of TextRank algorithms ("textgraphs") for phrase extraction and summarization of text documents.


"Graph Levels of Detail"
Paco Nathan
Derwen (2023-11-12)

How can we work with graph data in more abstracted, aggregate perspectives? While we can run queries on graph data to compute aggregate measures, we don’t have programmatic means of “zooming out” to consider a large graph the way that one zooms out when using an online map.

– Q –


"Semantic Random Walk for Graph Representation Learning in Attributed Graphs"
Meng Qin
Hong Kong University of Science and Technology (2023-05-11)

We introduce a novel SGR method that formulates network embedding in attributed graphs as a high-order proximity based embedding task over an auxiliary weighted graph with heterogeneous entities.


"IRWE: Inductive Random Walk for Joint Inference of Identity and Position Network Embedding"
Meng Qin, Dit-Yan Yeung
Hong Kong University of Science and Technology (2024-01-01)

Since nodes in a community should be densely connected, nodes within the same community are more likely to be reached via RWs compared with those in different communities. Therefore, nodes with similar positions (e.g., in the same community) are highly believed to have similar RW statistics.

– R –


"Random walks for text semantic similarity"
Daniel Ramage, Anna Rafferty, Christopher Manning
ACL-IJCNLP (2009-09-07)

Our algorithm aggregates local relatedness information via a random walk over a graph constructed from an underlying lexical resource. The stationary distribution of the graph walk forms a “semantic signature” that can be compared to another such distribution to get a relatedness score for texts.
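The "semantic signature" idea can be approximated with a personalized PageRank over a small lexical graph; the graph and seed words below are invented for illustration, and cosine similarity stands in for the paper's distribution comparison:

```python
import math
import networkx as nx

# Toy lexical graph standing in for a resource like WordNet
g = nx.Graph([
    ("cat", "animal"), ("dog", "animal"),
    ("cat", "pet"), ("dog", "pet"),
    ("car", "vehicle"), ("truck", "vehicle"),
])

def signature(words):
    # Stationary distribution of a walk biased toward the seed words
    return nx.pagerank(g, personalization={w: 1.0 for w in words})

def cosine(p, q):
    dot = sum(p[n] * q[n] for n in g)
    norm_p = math.sqrt(sum(v * v for v in p.values()))
    norm_q = math.sqrt(sum(v * v for v in q.values()))
    return dot / (norm_p * norm_q)

# Related words share neighbors, so their signatures should be closer
print(cosine(signature(["cat"]), signature(["dog"])))
print(cosine(signature(["cat"]), signature(["car"])))
```

Because "cat" and "dog" share neighbors in the graph, their walk distributions overlap far more than those of "cat" and "car", which is the relatedness signal the paper exploits.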

– W –


"Natural Intelligence is All You Need™"
Vincent Warmerdam
PyData Amsterdam (2023-09-15)

In this talk I will try to show you what might happen if you allow yourself the creative freedom to rethink and reinvent common practices once in a while. As it turns out, in order to do that, natural intelligence is all you need. And we may start needing a lot of it in the near future.


"MindMap: Knowledge Graph Prompting Sparks Graph of Thoughts in Large Language Models"
Yilin Wen, Zifeng Wang, Jimeng Sun
arXiv (2023-08-17)

We build a prompting pipeline that endows LLMs with the capability of comprehending KG inputs and inferring with a combined implicit knowledge and the retrieved external knowledge.


"Transformers: State-of-the-Art Natural Language Processing"
Thomas Wolf, Lysandre Debut, Victor Sanh, Julien Chaumond, Clement Delangue, Anthony Moi, Pierric Cistac, Tim Rault, Remi Louf, Morgan Funtowicz, Joe Davison, Sam Shleifer, Patrick von Platen, Clara Ma, Yacine Jernite, Julien Plu, Canwen Xu, Teven Le Scao, Sylvain Gugger, Mariama Drame, Quentin Lhoest, Alexander Rush
EMNLP (2020-11-16)

The library consists of carefully engineered state-of-the-art Transformer architectures under a unified API. Backing this library is a curated collection of pretrained models made by and available for the community.