TokenBuffer for preprocessing Documents #143

We have been using a [**fast** TokenBuffer API](https://github.com/JuliaText/WordTokenizers.jl/blob/master/src/words/fast.jl) to speed up various tokenizers in WordTokenizers.jl.

Referring to #141 and #140, I think it would be beneficial to extend the TokenBuffer API to the Document and Corpus types that TextAnalysis.jl offers (excluding NGramDocument and TokenDocument).
This could then be used to improve the performance of [preprocessing.jl](https://github.com/JuliaText/TextAnalysis.jl/blob/master/src/preprocessing.jl).
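
To make the idea concrete, here is a minimal sketch of single-pass preprocessing over a character buffer. It is written in Python purely as an illustration and is not the WordTokenizers.jl or TextAnalysis.jl API: `CharBuffer`, `preprocess`, and the step functions are all hypothetical names. The assumption is that each preprocessing step looks at the current position and either consumes input or declines, so several steps can share one scan of the text rather than each making its own pass.

```python
import string


class CharBuffer:
    """Hypothetical buffer: reads `text` once and collects kept characters in `out`."""

    def __init__(self, text):
        self.text = text
        self.idx = 0   # current read position
        self.out = []  # characters that survive preprocessing

    def isdone(self):
        return self.idx >= len(self.text)


def drop_punctuation(buf):
    """Skip one ASCII punctuation character; return True if input was consumed."""
    if buf.text[buf.idx] in string.punctuation:
        buf.idx += 1
        return True
    return False


def drop_digits(buf):
    """Skip one digit; return True if input was consumed."""
    if buf.text[buf.idx].isdigit():
        buf.idx += 1
        return True
    return False


def squeeze_spaces(buf):
    """Consume a run of whitespace and emit at most one separating space."""
    if not buf.text[buf.idx].isspace():
        return False
    while not buf.isdone() and buf.text[buf.idx].isspace():
        buf.idx += 1
    # Avoid leading spaces and doubled spaces in the output.
    if buf.out and buf.out[-1] != " ":
        buf.out.append(" ")
    return True


def keep_character(buf):
    """Fallback: copy the current character to the output unchanged."""
    buf.out.append(buf.text[buf.idx])
    buf.idx += 1


def preprocess(text, steps):
    """Apply all `steps` in a single left-to-right scan of `text`."""
    buf = CharBuffer(text)
    while not buf.isdone():
        # The first step that consumes input wins; otherwise keep the character.
        if not any(step(buf) for step in steps):
            keep_character(buf)
    return "".join(buf.out).rstrip()


cleaned = preprocess("Hello,  world! 42 times.",
                     [drop_punctuation, drop_digits, squeeze_spaces])
print(cleaned)
```

The order of `steps` acts as a priority list, which mirrors how the tokenizers chain their matchers. Whether this actually beats the current approach in preprocessing.jl would need benchmarking.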

**Edit:** This could also serve as a solution for #74 and #76.
