License: CC-BY-NC-SA-4.0
  • Preamble
  • Part 1: Parsing natural language
  • Part 2: Intro to spaCy
  • Part 3: Acquiring text
  • Part 4: Understanding text
  • Part 5: Examples
  • Part 6: Transformers
  • Part 7: Ethics and compliance
  • Part 8: Hardware
  • Part 9: Energy and policy considerations
  • Outro
Paco Nathan
2019-11-19 11:02:09

An introduction to natural language work based on the spaCy library in Python.

Unless otherwise noted, all other works herein are licensed under a Creative Commons Attribution-ShareAlike 4.0 International License. For further details regarding marks, security, ethics, privacy, and other compliance matters, see our policy page and sitemap.