Posts
2026
2022
- UIE - Universal Information Extraction
  UIE is a unified text-to-structure generation framework that can universally model different IE tasks, adaptively generate targeted structures, and learn general IE abilities in a unified way from different knowledge sources.
- Vector Search Engine
  Vector search engines, also called vector databases, are a core piece of infrastructure behind most large deep learning deployments in industry.
2021
2018
- ⛵ Learning Meaning in Natural Language Processing — The Semantics Mega-Thread
  A few days ago I came across a very worthwhile article on medium.com. It summarizes a Twitter thread covering meaning, semantics, language models, learning Thai and Java, entailment, and co-reference, all in one fascinating thread. The original article is [here](https://medium.com/huggingface/learning-meaning-in-natural-language-processing-the-semantics-mega-thread-9c0332dfe28e).
- Filtered-Space Saving Top-K
  [Filtered-Space Saving](http://www.l2f.inesc-id.pt/~fmmb/wiki/uploads/Work/dict.refd.pdf) (FSS) is a data structure and algorithm combination useful for accurately estimating the top-k most frequent values appearing in a stream while using a constant, minimal memory footprint.
- Gödel’s First Incompleteness Theorem for Programmers
  Gödel's incompleteness theorems are two theorems of mathematical logic that demonstrate the inherent limitations of every formal axiomatic system containing basic arithmetic. These results, published by [Kurt Gödel](https://en.wikipedia.org/wiki/Kurt_G%C3%B6del) in 1931, are important both in mathematical logic and in the philosophy of mathematics.
- Weighted Random: algorithms for sampling from discrete probability distributions
  For repeated weighted random sampling, the [Alias Method](https://en.wikipedia.org/wiki/Alias_method) is the asymptotically optimal choice: it requires $O(n)$ time to initialize, $O(1)$ time to make a selection, and $O(n)$ memory.
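
The Alias Method named in the Weighted Random entry above can be sketched as follows. This is a minimal illustration, assuming Vose's variant of table construction (which gives the $O(n)$ setup) and not necessarily the code from the post itself. The names `build_alias_table` and `sample` are hypothetical and chosen only for this example.

```python
import random


def build_alias_table(weights):
    """Build probability and alias tables in O(n) time (Vose's variant)."""
    n = len(weights)
    total = sum(weights)
    if n == 0 or total <= 0:
        raise ValueError("weights must be non-empty with a positive sum")

    # Scale so that the average weight is exactly 1.
    scaled = [w * n / total for w in weights]
    prob = [0.0] * n
    alias = [0] * n

    # Partition indices by whether they are below or above the average.
    small = [i for i, p in enumerate(scaled) if p < 1.0]
    large = [i for i, p in enumerate(scaled) if p >= 1.0]

    while small and large:
        s = small.pop()
        l = large.pop()
        prob[s] = scaled[s]   # chance of keeping s when column s is drawn
        alias[s] = l          # otherwise fall through to l
        # l donated (1 - scaled[s]) of its mass to fill column s.
        scaled[l] = scaled[l] + scaled[s] - 1.0
        (small if scaled[l] < 1.0 else large).append(l)

    # Leftovers hold exactly 1 up to floating-point error.
    for i in large + small:
        prob[i] = 1.0
        alias[i] = i
    return prob, alias


def sample(prob, alias, rng=random):
    """Draw one index in O(1): pick a column, then a biased coin flip."""
    i = rng.randrange(len(prob))
    return i if rng.random() < prob[i] else alias[i]
```

Building the tables once with `prob, alias = build_alias_table([1, 2, 3, 4])` and then calling `sample(prob, alias)` repeatedly returns index `3` about 40% of the time. The two tables together use $O(n)$ memory, matching the bounds quoted in the entry.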
2017
2015