Paco Nathan, Managing Partner at Derwen, Inc.
Known as a "player/coach", with core expertise in data science, cloud computing, natural language, and graph technologies; ~40 years of tech industry experience, ranging from Bell Labs to early-stage start-ups. Advisor for Amplify Partners, Recognai, KUNGFU.AI. Lead committer for PyTextRank and kglab. Formerly: Director, Community Evangelism @ Databricks and Apache Spark. Cited in 2015 as one of the Top 30 People in Big Data and Analytics by Innovation Enterprise.

GPG key: 0EEC 171D 3A38 7943 9E2E F23D 157E FBCA 16E9 2CF6
member: ACM, PSF, NumFOCUS, ESIP

  • arXiv trends, GitHub (2021-06-16): Analyze trends in articles published on arXiv using NLP, knowledge graph, and time series
  • MkRefs, GitHub (2021-05-10): MkDocs plugin to generate semantic reference Markdown pages from a knowledge graph
  • kglab, PyPI (2020-11-09): A simple abstraction layer in Python for building knowledge graphs
  • RLlib tutorials, GitHub (2020-01-29): RLlib tutorials, including Market Bandit and RL RecSys
  • PyTextRank, PyPI (2016-02-02): Python implementation of TextRank algorithm by Mihalcea, et al., implemented as a spaCy pipeline extension
  • disparity filter, GitHub (2018-11-16): Implements a disparity filter in Python, based on graphs in NetworkX, to extract the multiscale backbone of a complex weighted network
  • spaCy tuTorial, GitHub (2019-09-04): A brief tutorial for spaCy 2.x, which runs on Google Colab
  • Exsto, GitHub (2015-07-15): analyze the structure and dynamics of an open source project's developer community using NLP, graph algorithms, etc.
  • Ray core tutorial, GitHub (2021-02-23): An introductory tutorial about leveraging Ray core features for distributed patterns
  • gym_example, GitHub (2020-07-13): An example implementation of an OpenAI Gym environment used for a Ray RLlib tutorial
  • Anyscale Academy, GitHub (2020-06-14): Reinforcement learning tutorials for Ray RLlib, with Dean Wampler
  • JupyterLab Metadata Extension, GitHub (2019-10-22): explores linked data for resources (datasets) in JupyterLab
  • richcontext.scholapi, GitHub (2019-11-23): API integrations for federating discovery services and metadata exchange across multiple scholarly infrastructure providers, with Ernesto Gimeno, Erik Lopez, Sophie Rand, Ian Mulvany
  • Exelixi, GitHub (2013): a pure Python framework based on Apache Mesos, for GA/GP at scale
  • Cascading.Pattern, GitHub (2013): machine learning libraries for Cascading, translating PMML to Hadoop apps
  • City of Palo Alto Open Data, GitHub (2012): Cascading and Cascalog recommender apps based on CoPA Open Data
  • Cascading for the Impatient, GitHub (2013): a progressive series of introductory sample apps for Cascading
  • Cascading.SampleRecommender, GitHub (2012): an example social recommender, based on stock Tweets
  • Getting Started on Hadoop, GitHub (2010): Hadoop streaming in Python, to analyze Enron email data set
  • TextRank, GitHub (2009): Java implementation of TextRank algorithm by Mihalcea, et al.
  • OpenSIMS, SourceForge (2004): integrates OSS tools for security management plus real-time analysis visualization
  • JFRED Chatbot Server, Robitron (1996): chatbot platform used for FringeWare eCRM, for BBC live televised Turing Test, etc.