Projects

pullword: Unsupervised Word Discovery

Pullword is intended for settings where only unlabelled text data is available.

FMR: functional meaning representation

A formal language for representing meaning and a system for semantic parsing.

gocc: Golang version of OpenCC (Traditional-Simplified Chinese conversion)

gocc is a Golang port of OpenCC (Open Chinese Convert, 開放中文轉換), a project developed by BYVoid for converting between Traditional and Simplified Chinese.

ling: A Natural Language Processing toolkit in Golang

Tokenization, normalization, lemmatization, tagging, etc.

Goobot: A general multilingual web article extractor

Goobot is a general multilingual web article extractor. Like diffbot.com, it works without rules or training, and it is more than 10 times faster than diffbot.

Knowledge extraction from web pages

Extraction of worldwide phone number information from a web page repository.

Web Crawling at Scale

A flexible, high-performance distributed crawler framework.

Recent Posts

I found a very worthwhile article while browsing medium.com a few days ago. It summarizes a Twitter thread that covered meaning, semantics, language models, learning Thai and Java, entailment, and co-reference, all in one fascinating thread. The original article is here.

Filtered-Space Saving (FSS) is a data structure and algorithm combination useful for accurately estimating the top k most frequent values appearing in a stream while using a constant, minimal memory footprint.
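
To give a feel for the bounded-memory idea, here is a minimal sketch of the plain Space-Saving algorithm that FSS builds on. It is not FSS itself: as far as I understand it, FSS places a hashed filter in front of the monitored list to avoid needless replacements, and that filter is left out here. The function name `space_saving` and the parameter `m` (the number of monitored items) are illustrative choices, not names from the post.

```python
def space_saving(stream, m):
    """Estimate the most frequent items in `stream` using at most m counters.

    Plain Space-Saving, without the filter stage that FSS adds.
    Returns (item, estimated_count, max_overestimate) tuples, highest count first.
    """
    counts = {}  # item -> estimated count (never below the true count)
    errors = {}  # item -> upper bound on how much the estimate overshoots
    for x in stream:
        if x in counts:
            counts[x] += 1
        elif len(counts) < m:
            counts[x] = 1
            errors[x] = 0
        else:
            # Replace the item with the smallest counter. The newcomer inherits
            # that counter, which is also its maximum possible overestimate.
            # (A linear scan keeps the sketch short; real implementations use
            # a structure that finds the minimum in constant time.)
            victim = min(counts, key=counts.get)
            floor = counts.pop(victim)
            errors.pop(victim)
            counts[x] = floor + 1
            errors[x] = floor
    return sorted(
        ((item, c, errors[item]) for item, c in counts.items()),
        key=lambda t: -t[1],
    )


if __name__ == "__main__":
    for item, count, err in space_saving("abacadaeafab", m=3):
        print(item, count, err)
```

Memory stays fixed at `m` counters no matter how long the stream is, and any item whose true frequency exceeds the stream length divided by `m` is guaranteed to be among the monitored items.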

Gödel’s incompleteness theorems are two theorems of mathematical logic that demonstrate the inherent limitations of every formal axiomatic system containing basic arithmetic. These results, published by Kurt Gödel in 1931, are important both in mathematical logic and in the philosophy of mathematics.

The optimal solution for weighted random selection is arguably the Alias Method. It requires $O(n)$ time to initialize, $O(1)$ time to make a selection, and $O(n)$ memory.
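
The costs quoted above can be seen in a minimal sketch of the Alias Method (Vose's variant). The names `build_alias_table` and `alias_sample` are illustrative and not taken from the post; the sketch assumes all weights are non-negative and at least one is positive.

```python
import random


def build_alias_table(weights):
    """Build the probability and alias tables in O(n) time and O(n) memory."""
    n = len(weights)
    total = sum(weights)
    # Scale so that the average weight is exactly 1.
    scaled = [w * n / total for w in weights]
    prob = [0.0] * n
    alias = list(range(n))
    small = [i for i, p in enumerate(scaled) if p < 1.0]
    large = [i for i, p in enumerate(scaled) if p >= 1.0]
    while small and large:
        s = small.pop()
        l = large.pop()
        # Column s keeps its own mass and borrows the rest from l.
        prob[s] = scaled[s]
        alias[s] = l
        scaled[l] = scaled[l] + scaled[s] - 1.0
        (small if scaled[l] < 1.0 else large).append(l)
    # Whatever is left has mass 1 up to rounding error.
    for i in small + large:
        prob[i] = 1.0
    return prob, alias


def alias_sample(prob, alias, rng=random):
    """Draw one index in O(1): pick a column, then flip a biased coin."""
    i = rng.randrange(len(prob))
    return i if rng.random() < prob[i] else alias[i]


if __name__ == "__main__":
    prob, alias = build_alias_table([1, 2, 3, 4])
    draws = [alias_sample(prob, alias) for _ in range(10000)]
    print([draws.count(i) for i in range(4)])
```

Each draw costs one random index and one random float regardless of `n`, which is where the $O(1)$ selection time comes from.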

A Python implementation illustrating the behavior of the logistic map.
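
As a rough illustration (not the code from the post), the logistic map $x_{n+1} = r x_n (1 - x_n)$ can be iterated in a few lines. The function name `logistic_map` and the sample values of `r` are assumptions chosen for demonstration.

```python
def logistic_map(r, x0, steps):
    """Return [x_0, x_1, ..., x_steps] for x_{n+1} = r * x_n * (1 - x_n)."""
    xs = [x0]
    for _ in range(steps):
        x = xs[-1]
        xs.append(r * x * (1.0 - x))
    return xs


if __name__ == "__main__":
    # r = 2.5 settles on a single fixed point, r = 3.2 alternates between
    # two values, and r = 3.9 wanders chaotically.
    for r in (2.5, 3.2, 3.9):
        tail = logistic_map(r, 0.2, 200)[-4:]
        print(r, [round(x, 4) for x in tail])
```

Printing the last few iterates for different values of `r` is enough to see the change from a stable point to a cycle to chaos.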

Recent & Upcoming Talks

The Past, Present, and Future of Artificial Intelligence
Jun 17, 2017 2:00 PM
"Big Data" Methodology and Examples
Nov 26, 2015 9:00 AM

Contact

  • liang@zliu.org
  • Baidu Technology Park, No.10 Xibeiwang East Road, Haidian District, Beijing, China