Paco Nathan, Managing Partner at Derwen, Inc.
Known as a "player/coach", with core expertise in graph technologies, natural language, data science, and cloud computing; ~40 years of tech industry experience, ranging from Bell Labs to early-stage start-ups. Board member for Argilla.io; Advisor for KUNGFU.AI. Lead committer on PyTextRank, kglab. Formerly: Director, Community Evangelism for Apache Spark at Databricks.

GPG key: 0EEC 171D 3A38 7943 9E2E F23D 157E FBCA 16E9 2CF6
ORCID iD: orcid.org/0000-0003-3167-1539
member: ACM, PSF, NumFOCUS, ESIP

  • arXiv trends, GitHub (2021-06-16): Analyze trends in articles published on arXiv using NLP, knowledge graph, and time series
  • MkRefs, GitHub (2021-05-10): MkDocs plugin to generate semantic reference Markdown pages from a knowledge graph
  • Ray tutorials, GitHub (2021-02-23): An introductory tutorial about leveraging Ray core features for distributed patterns
  • kglab, PyPI (2020-11-09): A simple abstraction layer in Python for building knowledge graphs
  • RLlib tutorials, GitHub (2020-01-29): RLlib tutorials, including Market Bandit and RL RecSys
  • PyTextRank, PyPI (2016-02-02): Python implementation of the TextRank algorithm by Mihalcea, et al., implemented as a spaCy pipeline extension
  • disparity filter, GitHub (2018-11-16): Implements a disparity filter in Python, based on graphs in NetworkX, to extract the multiscale backbone of a complex weighted network
  • spaCy tuTorial, GitHub (2019-09-04): A brief tutorial for spaCy 2.x, which runs on Google Colab
  • Exsto, GitHub (2015-07-15): analyze the structure and dynamics of an open source project's developer community using NLP, graph algorithms, etc.
  • Ray core tutorial, GitHub (2021-02-23): An introductory tutorial about leveraging Ray core features for distributed patterns.
  • gym_example, GitHub (2020-07-13): An example implementation of an OpenAI Gym environment used for a Ray RLlib tutorial
  • Anyscale Academy, GitHub (2020-06-14): Reinforcement learning tutorials for Ray RLlib, with Dean Wampler
  • JupyterLab Metadata Extension, GitHub (2019-10-22): explores linked data for resources (datasets) in JupyterLab
  • richcontext.scholapi, GitHub (2019-11-23): API integrations for federating discovery services and metadata exchange across multiple scholarly infrastructure providers, with Ernesto Gimeno, Erik Lopez, Sophie Rand, Ian Mulvany
  • Exelixi, GitHub (2013): a pure Python framework based on Apache Mesos, for GA/GP at scale
  • Cascading.Pattern, GitHub (2013): machine learning libraries for Cascading, translating PMML to Hadoop apps
  • City of Palo Alto Open Data, GitHub (2012): Cascading and Cascalog recommender apps based on CoPA Open Data
  • Cascading for the Impatient, GitHub (2013): a progressive series of introductory sample apps for Cascading
  • Cascading.SampleRecommender, GitHub (2012): an example social recommender, based on stock Tweets
  • Getting Started on Hadoop, GitHub (2010): Hadoop streaming in Python, to analyze Enron email data set
  • TextRank, GitHub (2009): Java implementation of TextRank algorithm by Mihalcea, et al.
  • OpenSIMS, SourceForge (2004): integrates OSS tools for security management plus real-time analysis visualization
  • JFRED Chatbot Server, Robitron (1996): chatbot platform used for FringeWare eCRM, for BBC live televised Turing Test, etc.