

– A –

abstractive summarization

Generating a short, concise summary that captures the salient ideas of the source text, potentially using new phrases and sentences that do not appear in the source.

– C –


See also: cloud computing

cloud computing

The on-demand availability of computing resources over the Internet without direct active management by the user, often paid for on a short-term basis, giving the illusion of infinite computing resources available and thereby eliminating the need to plan far ahead for provisioning.


computable content

Interactive learning materials which leverage remote computation, e.g., Jupyter notebooks.


coreference resolution

Clustering mentions within a text that refer to the same underlying entities.


– D –


See also: deep learning

data science

An interdisciplinary field that emerged from industry rather than academia. It focuses on deriving insights from data, emphasizes curiosity and domain expertise, and applies increasingly advanced mathematics to novel business cases in response to surges in data rates and compute resources.


data strategy

The tools, processes, and practices that define how to manage and leverage data to make informed decisions.

deep learning

A family of machine learning methods based on artificial neural networks which use representation learning.


– E –

eigenvector centrality

A measure of the influence of a node within a network, in which a node scores higher when it is connected to other high-scoring nodes.
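
As a rough illustration, eigenvector centrality can be approximated with power iteration. The sketch below assumes an undirected, connected graph stored as a dictionary of neighbor lists; the function name and the sample graph are hypothetical and chosen only for this example.

```python
def eigenvector_centrality(adjacency, iterations=100, tol=1e-9):
    """Approximate eigenvector centrality by power iteration.

    `adjacency` maps each node to a list of its neighbors (undirected graph).
    """
    nodes = list(adjacency)
    score = {n: 1.0 for n in nodes}
    for _ in range(iterations):
        # A node's new score is the sum of its neighbors' current scores.
        new = {n: sum(score[m] for m in adjacency[n]) for n in nodes}
        # Normalize to unit length so the values do not grow without bound.
        norm = sum(v * v for v in new.values()) ** 0.5 or 1.0
        new = {n: v / norm for n, v in new.items()}
        converged = max(abs(new[n] - score[n]) for n in nodes) < tol
        score = new
        if converged:
            break
    return score


# Hypothetical sample graph: a triangle (a, b, c) with one pendant node d.
graph = {
    "a": ["b", "c", "d"],
    "b": ["a", "c"],
    "c": ["a", "b"],
    "d": ["a"],
}
scores = eigenvector_centrality(graph)
```

In this sample, node "a" ends up with the highest score because it touches every other node, while "d" scores lowest because its only link is to "a".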


entity linking

Recognizing named entities within a text, then disambiguating them by linking to specific contexts in a knowledge graph.



extractive summarization

Summarizing the source text by identifying a subset of the sentences as the most important excerpts, then generating a sequence of them verbatim.
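
A minimal sketch of the idea follows. It scores sentences by average word frequency, which is a deliberately simple stand-in and not the TextRank approach; the function name, the naive sentence splitter, and the sample text are all assumptions made for illustration.

```python
import re
from collections import Counter


def extractive_summary(text, limit=2):
    """Return the `limit` highest-scoring sentences, verbatim, in source order."""
    # Naive sentence split: break after ., ! or ? followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    freq = Counter(re.findall(r"[a-z']+", text.lower()))

    def score(sentence):
        tokens = re.findall(r"[a-z']+", sentence.lower())
        # Average document-level frequency of the words in this sentence.
        return sum(freq[t] for t in tokens) / max(len(tokens), 1)

    ranked = sorted(range(len(sentences)), key=lambda i: score(sentences[i]), reverse=True)
    keep = sorted(ranked[:limit])  # restore the original sentence order
    return [sentences[i] for i in keep]


text = (
    "Graphs model relations between entities. "
    "Graph algorithms rank the nodes of graphs. "
    "The weather was pleasant."
)
summary = extractive_summary(text, limit=2)
```

The selected sentences are copied unchanged from the source, which is the property that separates extractive from abstractive summarization.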

– G –

graph algorithms

A family of algorithms that operate on graphs for network analysis, measurement, ranking, partitioning, and other methods that leverage graph theory.


– K –


See also: knowledge graph


See also: knowledge graph conference

knowledge graph

A knowledge base that uses a graph-structured data model, representing and annotating interlinked descriptions of entities, with an overlay of semantic metadata.
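
One common way to picture a graph-structured data model is as a set of subject, predicate, object triples. The sketch below is only an illustration of that shape: the predicate names (`is_a`, `based_on`, `described_in`) and both helper functions are hypothetical, not part of any particular knowledge graph library.

```python
# Each triple links two entities (or an entity and a value) with a named relation.
triples = [
    ("PageRank", "is_a", "graph algorithm"),
    ("TextRank", "is_a", "graph algorithm"),
    ("TextRank", "based_on", "PageRank"),
    ("TextRank", "described_in", "mihalcea04textrank"),
]


def neighbors(triples, subject):
    """All (predicate, object) pairs attached to a subject."""
    return {(p, o) for s, p, o in triples if s == subject}


def subjects_with(triples, predicate, obj):
    """All subjects that have the given predicate pointing at the given object."""
    return sorted(s for s, p, o in triples if p == predicate and o == obj)


algorithms = subjects_with(triples, "is_a", "graph algorithm")
textrank_links = neighbors(triples, "TextRank")
```

Here the `is_a` triples play the role of the semantic metadata overlay, while the remaining triples interlink the entity descriptions.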


knowledge graph conference

Founded in 2019 at Columbia University, the Knowledge Graph Conference is emerging as the premier source of learning around knowledge graph technologies. We believe knowledge graphs are an underutilized yet essential force for solving complex societal challenges like climate change, democratizing access to knowledge and opportunity, and capturing business value made possible by the AI revolution.

– L –

language model

A statistical model used for predicting the next word or character within a document.
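
A bigram model is perhaps the simplest instance of this definition: it estimates the probability of the next word from counts of adjacent word pairs. The sketch below assumes whitespace tokenization and a toy corpus; the function names are hypothetical.

```python
from collections import Counter, defaultdict


def train_bigram_model(tokens):
    """Count how often each word follows each other word."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1
    return counts


def next_word_probabilities(model, word):
    """Relative frequency of each word observed after `word`."""
    following = model.get(word, Counter())
    total = sum(following.values())
    return {w: c / total for w, c in following.items()} if total else {}


tokens = "the cat sat on the mat and the cat slept".split()
model = train_bigram_model(tokens)
probs = next_word_probabilities(model, "the")
```

In the toy corpus, "the" is followed by "cat" twice and "mat" once, so the model assigns "cat" twice the probability of "mat". A word that never appears in training yields an empty distribution.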



lemma graph

A graph data structure used to represent links among phrases extracted from a source text, during the operation of the TextRank algorithm.

Described in: [mihalcea04textrank]
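
As a hedged sketch of how such a graph might be assembled, the code below links lemmas that co-occur within a small sliding window. It assumes the input has already been tokenized, lemmatized, and filtered; the function name, the window size, and the sample lemmas are illustrative assumptions, not the exact construction used by any specific TextRank implementation.

```python
from collections import defaultdict


def build_lemma_graph(lemmas, window=2):
    """Link each lemma to the lemmas that follow it within `window` positions."""
    graph = defaultdict(set)
    for i, lemma in enumerate(lemmas):
        for other in lemmas[i + 1 : i + 1 + window]:
            if other != lemma:
                # Undirected edge: record the link in both directions.
                graph[lemma].add(other)
                graph[other].add(lemma)
    return graph


# Hypothetical pre-lemmatized input.
lemmas = ["graph", "algorithm", "rank", "phrase", "graph", "text"]
lemma_graph = build_lemma_graph(lemmas, window=2)
```

Because "graph" occurs twice in the sample, both occurrences map to the same node and it accumulates more links than any other lemma, which is what lets a ranking algorithm later favor it.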

– N –


See also: named entity recognition


See also: natural language

named entity recognition

Extracting mentions of named entities from unstructured text, then annotating them with pre-defined categories.


natural language

The intersection of computer science and linguistics, commonly called natural language processing (NLP), used to leverage data in the form of text, speech, and images to identify structure and meaning. Also used for enabling people and computer-based agents to interact using natural language.


– P –

personalized pagerank

Using the personalized teleportation behaviors originally described for the PageRank algorithm to focus ranked results within a neighborhood of the graph, given a set of nodes as input.

Described in: [page1998], [gleich15]
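
The teleportation idea can be sketched with a short power iteration in which the random jump always lands on the input nodes rather than on a uniformly chosen node. The function name, the damping value of 0.85, the treatment of dangling nodes, and the sample graph are assumptions made for this illustration.

```python
def personalized_pagerank(out_links, seeds, damping=0.85, iterations=100):
    """Power iteration for PageRank with teleportation restricted to `seeds`.

    `out_links` maps each node to the list of nodes it links to.
    """
    nodes = list(out_links)
    seeds = set(seeds)
    # Teleport distribution: uniform over the seed nodes, zero elsewhere.
    teleport = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    rank = dict(teleport)
    for _ in range(iterations):
        new = {n: (1.0 - damping) * teleport[n] for n in nodes}
        for n in nodes:
            targets = out_links[n]
            if targets:
                share = damping * rank[n] / len(targets)
                for m in targets:
                    new[m] += share
            else:
                # Dangling node: send its mass back to the seed set.
                for m in nodes:
                    new[m] += damping * rank[n] * teleport[m]
        rank = new
    return rank


# Hypothetical path graph a - b - c - d - e, with links in both directions.
graph = {
    "a": ["b"],
    "b": ["a", "c"],
    "c": ["b", "d"],
    "d": ["c", "e"],
    "e": ["d"],
}
ranks = personalized_pagerank(graph, seeds={"a"})
```

With "a" as the only seed, rank concentrates in the neighborhood of "a", so "a" outranks "e" and "b" outranks "d" even though the graph itself is symmetric.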

phrase extraction

Selecting representative phrases from a document as its characteristic entities; in contrast to keyword analysis.

– R –


See also: reinforcement learning

reinforcement learning

Optimal control theory mixed with deep learning, in which software agents learn to take actions within an environment and make sequences of decisions to maximize a cumulative reward. Problems are typically stated in terms of a Markov decision process, and the agent must find a balance between exploration (uncharted territory) and exploitation (current knowledge). Generally a reverse engineering of various psychological learning processes.


– S –

semantic relations

Associations that exist between the meanings of phrases.

stop words

Words to be filtered out during natural language processing.
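
A minimal sketch of stop word filtering is shown below. The stop word list is a tiny hand-picked sample for illustration only; practical lists are much longer and language-specific, and the function name is hypothetical.

```python
# Tiny illustrative list; real stop word lists contain far more entries.
STOP_WORDS = {"a", "an", "and", "in", "of", "on", "the", "to"}


def remove_stop_words(tokens, stop_words=STOP_WORDS):
    """Drop tokens whose lowercase form appears in the stop word list."""
    return [t for t in tokens if t.lower() not in stop_words]


tokens = "The influence of a node in the network".split()
filtered = remove_stop_words(tokens)
# filtered == ["influence", "node", "network"]
```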



summarization

Producing a shorter version of one or more documents, while preserving most of the input's meaning.


– T –

text summarization

See also: summarization


textgraphs

Use of graph algorithms for NLP, based on a graph representation of a source text.




transformers

A family of deep learning models, mostly used in NLP, which adopt the mechanism of attention to weigh the influence of different parts of the input data.
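
The attention mechanism can be sketched in a few lines as scaled dot-product attention: each query is compared against every key, the scores are turned into weights with a softmax, and the weights mix the values. This is a plain-Python illustration under simplifying assumptions (a single head, no learned projections, toy vectors); the function names are hypothetical.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Scaled dot-product attention over lists of equal-length vectors."""
    dim = len(keys[0])
    outputs, weights = [], []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(dimension).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(dim) for k in keys]
        w = softmax(scores)
        weights.append(w)
        # Output is the weighted average of the value vectors.
        outputs.append(
            [sum(wj * v[d] for wj, v in zip(w, values)) for d in range(len(values[0]))]
        )
    return outputs, weights


# Toy input: one query that points in the same direction as the first key.
keys = [[1.0, 0.0], [0.0, 1.0]]
values = [[10.0, 0.0], [0.0, 10.0]]
queries = [[5.0, 0.0]]
outputs, weights = attention(queries, keys, values)
```

Because the query aligns with the first key, most of the weight falls on the first value vector, which is how attention lets one part of the input influence the output more than another.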



Last update: 2021-06-29