Paco Nathan
Known as a "player/coach", with core expertise in data science, natural language, machine learning, cloud computing; 38+ years tech industry experience, ranging from Bell Labs to early-stage start-ups. Advisor for Amplify Partners, IBM Data Science Community, Recognai, KUNGFU.AI, Primer. Lead committer PyTextRank. Formerly: Director, Community Evangelism @ Databricks and Apache Spark. Cited in 2015 as one of the Top 30 People in Big Data and Analytics by Innovation Enterprise.

GPG key: 0EEC 171D 3A38 7943 9E2E F23D 157E FBCA 16E9 2CF6
ORCID iD: orcid.org/0000-0003-3167-1539

  • PyTextRank, PyPI (2017-03-13): Python implementation of TextRank algorithm by Mihalcea, et al., implemented as a spaCy pipeline extension
  • spaCy tuTorial, GitHub (2019-09-04): A brief tutorial for spaCy 2.x, which runs on Google Colab
  • Anyscale Academy, GitHub (2020-06-14): Reinforcement learning tutorials for Ray RLlib, with Dean Wampler
  • richcontext.scholapi, GitHub (2019-11-23): API integrations for federating discovery services and metadata exchange across multiple scholarly infrastructure providers, with Ernesto Gimeno, Erik Lopez, Sophie Rand, Ian Mulvany
  • disparity filter, GitHub (2018-11-16): Implements a disparity filter in Python, based on graphs in NetworkX, to extract the multiscale backbone of a complex weighted network
  • JupyterLab Metadata Extension, GitHub (2019-10-22): explores linked data for resources (datasets) in JupyterLab
  • Exelixi, GitHub (2013): a pure Python framework based on Apache Mesos, for GA/GP at scale
  • Cascading.Pattern, GitHub (2013): machine learning libraries for Cascading, translating PMML to Hadoop apps
  • Cascading for the Impatient, GitHub (2013): a progressive series of introductory sample apps for Cascading
  • City of Palo Alto Open Data, GitHub (2012): Cascading and Cascalog recommender apps based on CoPA Open Data
  • Cascading.SampleRecommender, GitHub (2012): an example social recommender, based on stock Tweets
  • Getting Started on Hadoop, GitHub (2010): Hadoop streaming in Python, to analyze the Enron email data set
  • TextRank, GitHub (2009): Java implementation of TextRank algorithm by Mihalcea, et al.
  • OpenSIMS, SourceForge (2004): integrates OSS tools for security management plus real-time analysis and visualization
  • JFRED Chatbot Server, Robitron (1996): chatbot platform used for FringeWare eCRM, for BBC live televised Turing Test, etc.