Commit 65a8687

update README
1 parent 7c3b5ce commit 65a8687

1 file changed, +21 -22 lines changed

README.md

Lines changed: 21 additions & 22 deletions
@@ -1,8 +1,7 @@
-Wikipedia2Vec
-=============
+# Wikipedia2Vec
 
-[![Fury badge](https://badge.fury.io/py/wikipedia2vec.png)](http://badge.fury.io/py/wikipedia2vec)
-[![CircleCI](https://circleci.com/gh/wikipedia2vec/wikipedia2vec.svg?style=svg)](https://circleci.com/gh/wikipedia2vec/wikipedia2vec)
+[![tests](https://github.com/wikipedia2vec/wikipedia2vec/actions/workflows/test.yml/badge.svg?branch=master)](https://github.com/wikipedia2vec/wikipedia2vec/actions/workflows/test.yml)
+[![pypi Version](https://img.shields.io/pypi/v/wikipedia2vec.svg?style=flat-square&logo=pypi&logoColor=white)](https://pypi.org/project/wikipedia2vec/)
 
 Wikipedia2Vec is a tool used for obtaining embeddings (or vector representations) of words and entities (i.e., concepts that have corresponding pages in Wikipedia) from Wikipedia.
 It is developed and maintained by [Studio Ousia](http://www.ousia.jp).
@@ -14,7 +13,7 @@ This tool implements the [conventional skip-gram model](https://en.wikipedia.org
 
 An empirical comparison between Wikipedia2Vec and existing embedding tools (i.e., FastText, Gensim, RDF2Vec, and Wiki2vec) is available [here](https://arxiv.org/abs/1812.06280).
 
-Documentation are available online at [http://wikipedia2vec.github.io/](http://wikipedia2vec.github.io/).
+Documentation are available online at [http://wikipedia2vec.github.io/](http://wikipedia2vec.github.io/).
 
 ## Basic Usage
 
@@ -24,15 +23,15 @@ Wikipedia2Vec can be installed via PyPI:
 % pip install wikipedia2vec
 ```
 
-With this tool, embeddings can be learned by running a *train* command with a Wikipedia dump as input.
+With this tool, embeddings can be learned by running a _train_ command with a Wikipedia dump as input.
 For example, the following commands download the latest English Wikipedia dump and learn embeddings from this dump:
 
 ```bash
 % wget https://dumps.wikimedia.org/enwiki/latest/enwiki-latest-pages-articles.xml.bz2
 % wikipedia2vec train enwiki-latest-pages-articles.xml.bz2 MODEL_FILE
 ```
 
-Then, the learned embeddings are written to *MODEL\_FILE*.
+Then, the learned embeddings are written to _MODEL_FILE_.
 Note that this command can take many optional parameters.
 Please refer to [our documentation](https://wikipedia2vec.github.io/wikipedia2vec/commands/) for further details.
 
@@ -44,21 +43,21 @@ Pretrained embeddings for 12 languages (i.e., English, Arabic, Chinese, Dutch, F
 
 Wikipedia2Vec has been applied to the following tasks:
 
-* Entity linking: [Yamada et al., 2016](https://arxiv.org/abs/1601.01343), [Eshel et al., 2017](https://arxiv.org/abs/1706.09147), [Chen et al., 2019](https://arxiv.org/abs/1911.03834), [Poerner et al., 2020](https://arxiv.org/abs/1911.03681), [van Hulst et al., 2020](https://arxiv.org/abs/2006.01969).
-* Named entity recognition: [Sato et al., 2017](http://www.aclweb.org/anthology/I17-2017), [Lara-Clares and Garcia-Serrano, 2019](http://ceur-ws.org/Vol-2421/eHealth-KD_paper_6.pdf).
-* Question answering: [Yamada et al., 2017](https://arxiv.org/abs/1803.08652), [Poerner et al., 2020](https://arxiv.org/abs/1911.03681).
-* Entity typing: [Yamada et al., 2018](https://arxiv.org/abs/1806.02960).
-* Text classification: [Yamada et al., 2018](https://arxiv.org/abs/1806.02960), [Yamada and Shindo, 2019](https://arxiv.org/abs/1909.01259), [Alam et al., 2020](https://link.springer.com/chapter/10.1007/978-3-030-61244-3_9).
-* Relation classification: [Poerner et al., 2020](https://arxiv.org/abs/1911.03681).
-* Paraphrase detection: [Duong et al., 2018](https://ieeexplore.ieee.org/abstract/document/8606845).
-* Knowledge graph completion: [Shah et al., 2019](https://aaai.org/ojs/index.php/AAAI/article/view/4162), [Shah et al., 2020](https://www.aclweb.org/anthology/2020.textgraphs-1.9/).
-* Fake news detection: [Singh et al., 2019](https://arxiv.org/abs/1906.11126), [Ghosal et al., 2020](https://arxiv.org/abs/2010.10836).
-* Plot analysis of movies: [Papalampidi et al., 2019](https://arxiv.org/abs/1908.10328).
-* Novel entity discovery: [Zhang et al., 2020](https://arxiv.org/abs/2002.00206).
-* Entity retrieval: [Gerritse et al., 2020](https://link.springer.com/chapter/10.1007%2F978-3-030-45439-5_7).
-* Deepfake detection: [Zhong et al., 2020](https://arxiv.org/abs/2010.07475).
-* Conversational information seeking: [Rodriguez et al., 2020](https://arxiv.org/abs/2005.00172).
-* Query expansion: [Rosin et al., 2020](https://arxiv.org/abs/2012.12065).
+- Entity linking: [Yamada et al., 2016](https://arxiv.org/abs/1601.01343), [Eshel et al., 2017](https://arxiv.org/abs/1706.09147), [Chen et al., 2019](https://arxiv.org/abs/1911.03834), [Poerner et al., 2020](https://arxiv.org/abs/1911.03681), [van Hulst et al., 2020](https://arxiv.org/abs/2006.01969).
+- Named entity recognition: [Sato et al., 2017](http://www.aclweb.org/anthology/I17-2017), [Lara-Clares and Garcia-Serrano, 2019](http://ceur-ws.org/Vol-2421/eHealth-KD_paper_6.pdf).
+- Question answering: [Yamada et al., 2017](https://arxiv.org/abs/1803.08652), [Poerner et al., 2020](https://arxiv.org/abs/1911.03681).
+- Entity typing: [Yamada et al., 2018](https://arxiv.org/abs/1806.02960).
+- Text classification: [Yamada et al., 2018](https://arxiv.org/abs/1806.02960), [Yamada and Shindo, 2019](https://arxiv.org/abs/1909.01259), [Alam et al., 2020](https://link.springer.com/chapter/10.1007/978-3-030-61244-3_9).
+- Relation classification: [Poerner et al., 2020](https://arxiv.org/abs/1911.03681).
+- Paraphrase detection: [Duong et al., 2018](https://ieeexplore.ieee.org/abstract/document/8606845).
+- Knowledge graph completion: [Shah et al., 2019](https://aaai.org/ojs/index.php/AAAI/article/view/4162), [Shah et al., 2020](https://www.aclweb.org/anthology/2020.textgraphs-1.9/).
+- Fake news detection: [Singh et al., 2019](https://arxiv.org/abs/1906.11126), [Ghosal et al., 2020](https://arxiv.org/abs/2010.10836).
+- Plot analysis of movies: [Papalampidi et al., 2019](https://arxiv.org/abs/1908.10328).
+- Novel entity discovery: [Zhang et al., 2020](https://arxiv.org/abs/2002.00206).
+- Entity retrieval: [Gerritse et al., 2020](https://link.springer.com/chapter/10.1007%2F978-3-030-45439-5_7).
+- Deepfake detection: [Zhong et al., 2020](https://arxiv.org/abs/2010.07475).
+- Conversational information seeking: [Rodriguez et al., 2020](https://arxiv.org/abs/2005.00172).
+- Query expansion: [Rosin et al., 2020](https://arxiv.org/abs/2012.12065).
 
 ## References
 