

Community by Aneeque Ahmed from the Noun Project


Many thanks to our open source sponsors; and to our contributors: @dvsrepo, @Ankush-Chander, @louisguitton, @tomaarsen, @Mec-iS, @cutterkom, @RishiKumarRay, @ArenasGuerreroJulian, @fils, @gauravjaglan, @pebbie, @CatChenal, @jake-aft, @dmoore247; plus general support from Derwen, Inc.; the Knowledge Graph Conference and Connected Data World plus an even larger scope of use cases represented by their communities; Kubuntu Focus, the RAPIDS team @ NVIDIA, Gradient Flow, and Manning Publications.

Project Lead

Paco Nathan is lead committer on kglab and lead author for its documentation and tutorial. By day he's the Managing Partner at Derwen, Inc. Paco's formal background is in Mathematics (advisor: Richard Cottle) and Computer Science (advisor: Douglas Lenat), with additional work in Design and Linguistics. His business experience includes: Director, VP, and CTO positions leading data teams and machine learning projects; former CTO/Board member at two publicly-traded tech firms on NASDAQ OTC:BB; and an equity partner at Amplify Partners. Cited in 2015 as one of the Top 30 People in Big Data and Analytics by Innovation Enterprise.

  • ~40 years tech industry experience, ranging from Bell Labs to early-stage start-ups
  • 7+ years R&D in neural networks (incl. h/w accelerators) during 1980-90s
  • early "guinea pig" for Amazon AWS (2006), who led the first large-scale Hadoop use case on cloud computing (2008)
  • former Director, Community Evangelism at Databricks (2014-2015) for Apache Spark
  • lead committer on PyTextRank (spaCy pipeline); open source community work on Jupyter, Ray, Cascading
  • consultant to enterprise organizations for data strategy; advisor to several AI start-ups, including Recognai, KUNGFU.AI

As an author/speaker/instructor, Paco has taught many people (9000+) in industry across a range of topics – data science, natural language, cloud computing, computable content, etc. – and through guest lectures at Stanford, CMU, UC Berkeley, U da Coruña, U Manchester, KTH, NYU, GWU, U Maryland, Cal Poly, UT/Austin, Northeastern, U Virginia, CU Boulder.



Please use the following BibTeX entry for citing kglab if you use it in your research or software:

@software{kglab,
  author = {Paco Nathan},
  title = {{kglab: a simple abstraction layer in Python for building knowledge graphs}},
  year = 2020,
  publisher = {Derwen},
  doi = {10.5281/zenodo.6360664},
  url = {}
}


Source code for kglab, plus its logo, documentation, and examples, is released under the MIT license, which is succinct and simplifies use in commercial applications.

All materials herein are Copyright © 2020-2022 Derwen, Inc.


Production Use Cases

  • Derwen and its client projects

Similar Projects

See also:

  • PheKnowLator
    • pro: quite similar to kglab in intent; well-written code; sophisticated, opinionated build of biomedical KGs
    • con: less integration with data science tools or distributed systems
  • GraphScope
    • pro: loads of features, excellent support, broad adoption
    • con: less of a library, more of a client/server architecture; aims to reinvent instead of integrating
  • LynxKite
    • pro: loads of features, lots of adoption
    • con: complex tech stack that combines Py/Java/Go; its AGPL license is less than business-friendly for production apps
  • KGTK
    • pro: many excellent examples, well-documented in Jupyter notebooks
    • con: mostly a CLI tool, primarily based on TSV data
  • zincbase
    • pro: probabilistic graph measures, complex simulation suite, leverages GPUs
    • con: lacks interchange with RDF or other standard formats

In general, check for excellent curated listings of open source semantic technologies in Python.

Last update: 2022-03-23