Paco Nathan
2021-01-30 05:38:00

Python offers excellent libraries for working with graphs: semantic technologies, graph queries, interactive visualizations, graph algorithms, probabilistic graph inference, as well as embedding and other integrations with deep learning. However, most of these approaches share little common ground. Few of them integrate effectively with popular data science tools (pandas, scikit-learn, spaCy, PyTorch) or efficiently with popular data engineering infrastructure such as Spark, RAPIDS, Ray, Parquet, and fsspec. The `kglab` open source project integrates most of the above and provides ways to combine disparate techniques so that they complement each other. This talk also explores _graph thinking_ as a cognitive framework for approaching complex problem spaces. This framework is the missing piece between what stakeholders, domain experts, and business use cases require and what more "traditional" enterprise IT delivers, which is probably focused on approaches such as the "data lakehouse" but is not yet doing much with large graphs.