Commit af83c4d

update README
1 parent ec25189 commit af83c4d

3 files changed: 24 additions and 23 deletions

README-pypi.md

Lines changed: 8 additions & 14 deletions
@@ -10,20 +10,14 @@

 PyThaiNLP is a Python library for natural language processing (NLP) of Thai language.

-PyThaiNLP features include Thai word and subword segmentations, soundex, romanization, part-of-speech taggers, and spelling corrections.
-
-## What's new in version 1.7 ?
-
-- Deprecate Python 2 support. (Python 2 compatibility code will be completely dropped in PyThaiNLP 1.8)
-- Refactor pythainlp.tokenize.pyicu for readability
-- Add Thai NER model to pythainlp.ner
-- thai2vec v0.2 - larger vocab, benchmarking results on Wongnai dataset
-- Sentiment classifier based on ULMFit and various product review datasets
-- Add ULMFit utility to PyThaiNLP
-- Add Thai romanization model ThaiTransliterator
-- Retrain POS-tagging model
-- Improved word_tokenize (newmm, mm) and dict_word_tokenize
-- Documentation added
+PyThaiNLP includes Thai word tokenizers, transliterators, soundex converters, part-of-speech taggers, and spell checkers.
+
+## What's new in version 1.8 ?
+
+- New NorvigSpellChecker spell checker class, which can be initialized with custom dictionary.
+- Terminate Python 2 support. Remove all Python 2 compatibility code.
+- Remove old, obsolated, deprecated, and experimental code.
+- see [PyThaiNLP 1.8 change log](https://github.com/PyThaiNLP/pythainlp/issues/118)

 ## Install
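
The 1.8 notes above mention a Norvig-style spell checker that can be initialized with a custom dictionary. As a rough illustration of that idea only, here is a minimal sketch in plain Python. The class name `SimpleNorvigChecker`, its constructor argument, and its methods are hypothetical and are not PyThaiNLP's actual `NorvigSpellChecker` API; the sketch assumes the custom dictionary is a list of (word, frequency) pairs and considers only candidates one edit away.

```python
from collections import Counter


class SimpleNorvigChecker:
    """Hypothetical illustration; NOT PyThaiNLP's NorvigSpellChecker."""

    def __init__(self, custom_dict):
        # custom_dict: iterable of (word, frequency) pairs (an assumption).
        self.freq = Counter(dict(custom_dict))
        # Build the alphabet from the characters seen in the dictionary.
        self.alphabet = sorted({ch for word in self.freq for ch in word})

    def _edits1(self, word):
        # All strings one delete, transpose, replace, or insert away.
        splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
        deletes = [a + b[1:] for a, b in splits if b]
        transposes = [a + b[1] + b[0] + b[2:] for a, b in splits if len(b) > 1]
        replaces = [a + c + b[1:] for a, b in splits if b for c in self.alphabet]
        inserts = [a + c + b for a, b in splits for c in self.alphabet]
        return set(deletes + transposes + replaces + inserts)

    def known(self, words):
        # Keep only the words present in the dictionary.
        return {w for w in words if w in self.freq}

    def correct(self, word):
        # Prefer the word itself, then known one-edit candidates, else give up.
        candidates = self.known([word]) or self.known(self._edits1(word)) or {word}
        # Sort first so that ties are broken deterministically.
        return max(sorted(candidates), key=lambda w: self.freq[w])


checker = SimpleNorvigChecker([("แมว", 10), ("แมลง", 3), ("นก", 5)])
print(checker.correct("แมง"))  # the most frequent known word one edit away
```

Passing the dictionary into the constructor keeps the word list and its frequencies swappable, which is presumably the point of the "custom dictionary" option in the release note.
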

README.md

Lines changed: 3 additions & 3 deletions
@@ -12,9 +12,9 @@ Thai Natural Language Processing in Python.

 PyThaiNLP is a Python package for text processing and linguistic analysis, similar to `nltk` but with focus on Thai language.

-PyThaiNLP supports Python 3.4+.
-Since version 1.7, PyThaiNLP deprecates its support for Python 2. The future PyThaiNLP 1.8 will completely drop all supports for Python 2.
-Python 2 users can still use PyThaiNLP 1.6.
+PyThaiNLP 1.8 supports Python 3.6+. Some functions may work with older version of Python 3, but it is not well-tested and will not be supported. See [PyThaiNLP 1.8 change log](https://github.com/PyThaiNLP/pythainlp/issues/118).
+
+Python 2 users can use PyThaiNLP 1.6, our latest released that tested with Python 2.7.

 **This is a document for development branch (post 1.7.x). Things will break. For a document for stable branch, see [master](https://github.com/PyThaiNLP/pythainlp/tree/master).**

tests/__init__.py

Lines changed: 13 additions & 6 deletions
@@ -113,14 +113,18 @@ def test_wordnet(self):

         self.assertIsNotNone(wordnet.lemmas("นก"))
         self.assertIsNotNone(wordnet.all_lemma_names(pos=wn.ADV))
-        self.assertIsNotNone(wordnet.lemma('cat.n.01.cat'))
+        self.assertIsNotNone(wordnet.lemma("cat.n.01.cat"))

         self.assertEqual(wordnet.morphy("dogs"), "dog")

-        bird = wordnet.synset('bird.n.01')
-        mouse = wordnet.synset('mouse.n.01')
-        self.assertEqual(wordnet.path_similarity(bird, mouse), bird.path_similarity(mouse))
-        self.assertEqual(wordnet.wup_similarity(bird, mouse), bird.wup_similarity(mouse))
+        bird = wordnet.synset("bird.n.01")
+        mouse = wordnet.synset("mouse.n.01")
+        self.assertEqual(
+            wordnet.path_similarity(bird, mouse), bird.path_similarity(mouse)
+        )
+        self.assertEqual(
+            wordnet.wup_similarity(bird, mouse), bird.wup_similarity(mouse)
+        )

         cat_key = wordnet.synsets("แมว")[0].lemmas()[0].key()
         self.assertIsNotNone(wordnet.lemma_from_key(cat_key))
@@ -542,7 +546,10 @@ def test_thai2vec(self):
         self.assertGreaterEqual(thai2vec.similarity("แบคทีเรีย", "คน"), 0)
         self.assertIsNotNone(thai2vec.sentence_vectorizer(""))
         self.assertIsNotNone(thai2vec.sentence_vectorizer("เสรีภาพในการชุมนุม"))
-        self.assertIsNotNone(thai2vec.sentence_vectorizer("I think therefore I am ผ็ฎ์"))
+        self.assertIsNotNone(
+            thai2vec.sentence_vectorizer("เสรีภาพในการสมาคม", use_mean=True)
+        )
+        self.assertIsNotNone(thai2vec.sentence_vectorizer("I คิด therefore I am ผ็ฎ์"))
         self.assertEqual(
             thai2vec.most_similar_cosmul(["ราชา", "ผู้ชาย"], ["ผู้หญิง"])[0][0],
             "ราชินี",
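
The new test above calls `sentence_vectorizer` with `use_mean=True`. The sketch below shows one plausible reading of that flag: average the word vectors of the tokens instead of summing them. This is an assumption made for illustration, not a copy of the thai2vec implementation; the function name `sentence_vector`, the toy vocabulary, and the pre-tokenized input are all hypothetical.

```python
from typing import Dict, List


def sentence_vector(
    tokens: List[str],
    vectors: Dict[str, List[float]],
    dim: int,
    use_mean: bool = True,
) -> List[float]:
    """Combine word vectors into one sentence vector (illustrative only)."""
    total = [0.0] * dim
    count = 0
    for token in tokens:
        vec = vectors.get(token)
        if vec is None:
            # Skip out-of-vocabulary tokens (an assumption of this sketch).
            continue
        total = [a + b for a, b in zip(total, vec)]
        count += 1
    if use_mean and count:
        # Average over the tokens that actually had a vector.
        return [x / count for x in total]
    # Plain sum; an empty or fully unknown input yields the zero vector.
    return total


toy_vectors = {"เสรีภาพ": [1.0, 2.0], "สมาคม": [3.0, 4.0]}
print(sentence_vector(["เสรีภาพ", "ใน", "การ", "สมาคม"], toy_vectors, dim=2))
```

Returning a zero vector for empty input mirrors what the test seems to expect from `sentence_vectorizer("")`, namely a non-`None` result, though the real function's behavior there is not shown in this diff.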
