Data-First Strategy

Consider the recent use of direct preference optimization (DPO), together with open source tools such as Argilla and Distilabel, to identify and fix data quality issues in the dataset used to train Zephyr-7B-beta. The result was the Notus-7B-v1 model, created by a relatively small, "GPU-poor" R&D team, which then ranked highly on the Hugging Face leaderboards.

Andrew Ng:

While it's always nice to have massive numbers of NVIDIA H100 or AMD MI300X GPUs, this work is another illustration — out of many, I want to emphasize — that deep thinking with only modest computational resources can carry you far.

"Direct Preference Optimization: Your Language Model is Secretly a Reward Model"
Rafael Rafailov, et al.
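
To make the DPO objective from the paper cited above concrete, here is a minimal sketch of the per-example loss for one (prompt, chosen, rejected) triple. It assumes the four inputs are summed token log-probabilities of each response under the policy being trained and under a frozen reference model. The function name `dpo_loss` and the default `beta=0.1` are illustrative choices, not taken from the Notus work.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Per-example DPO loss: -log(sigmoid(beta * (chosen_ratio - rejected_ratio))).

    Each argument is the log-probability of a full response given the prompt.
    beta=0.1 is an assumed default for illustration only.
    """
    # Log-ratio of policy to reference for each response.
    chosen_logratio = policy_chosen_logp - ref_chosen_logp
    rejected_logratio = policy_rejected_logp - ref_rejected_logp

    # Scaled margin by which the policy prefers the chosen response.
    margin = beta * (chosen_logratio - rejected_logratio)

    # Numerically stable form of -log(sigmoid(margin)).
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


# When the policy equals the reference, the margin is zero and the loss is log(2).
baseline = dpo_loss(-1.0, -1.0, -1.0, -1.0)

# When the policy favors the chosen response more than the reference does,
# the loss drops below that baseline.
improved = dpo_loss(-1.0, -3.0, -2.0, -2.0)
```

Because the loss depends only on which response in each pair is labeled as chosen, a mislabeled pair pushes the policy in the wrong direction, which is why fixing the preference data matters so much here.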

Relation extraction (RE) projects in particular tend to use Wikidata labels (not IRIs) to train models; the labels are descriptive but not computable.
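
The gap between labels and IRIs can be sketched as a lookup step applied to RE model output. This is an illustrative sketch only: the table `LABEL_TO_IRI` and the helper `relation_to_iri` are hypothetical, and the two Wikidata properties shown (P19, P108) are examples rather than a mapping used by any particular RE project.

```python
# Namespace for Wikidata "direct" property IRIs.
WDT = "http://www.wikidata.org/prop/direct/"

# Hypothetical lookup table from human-readable relation labels to property IRIs.
LABEL_TO_IRI = {
    "place of birth": WDT + "P19",
    "employer": WDT + "P108",
}


def relation_to_iri(label):
    """Map a relation label predicted by an RE model to a Wikidata property IRI.

    Returns None when the label has no known IRI, so the caller can flag
    the prediction for review rather than adding an unlinked edge to the KG.
    """
    # Normalize case and whitespace, since label strings vary across datasets.
    key = " ".join(label.lower().split())
    return LABEL_TO_IRI.get(key)


linked = relation_to_iri("Place of  Birth")
unlinked = relation_to_iri("born somewhere near")
```

A label is just a string that a model emits, while the IRI identifies a specific property that queries and inference over a graph can use, so predictions that fail the lookup are candidates for data cleanup.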

Components such as named entity recognition (NER) and RE could be enhanced by improving the quality of their training data, benchmarks, evals, etc.

  • SpanMarker provides a framework for iteration on NER, to fine-tune for specific KGs

  • OpenNRE provides a framework for iteration on RE, to fine-tune for specific KGs

Data-first iterations on these components can take advantage of DPO, sparse fine-tuning, pruning, quantization, and so on, while the lemma graph plus its topological transforms provide enhanced tokenization and better context for training.