
Integration with gensim or fasttext #66

@petulla


First, thank you so much for this package. It's very useful.

I know you've already more or less answered this here, but I wasn't able to reproduce the example. I think it used an older version of fasttext.

import numpy as np
import fasttext as ft

fastText_wv = ft.load_model('fasttext/wiki.en.bin')

class FastTextEmbeddings(object):
    def __getitem__(self, item):
        item = np.array(item, copy=True)
        # len(fastText_wv) is not supported; for testing, insert a fixed value here
        item[item > len(fastText_wv)] = -1
        return fastText_wv.get_input_vector(item)

There are a few issues:
--fastText_wv does not support len(). That said, for testing we can hard-code a value, taken from the size reported when loading the model.
--get_input_vector cannot be called with the index array. It fails with:

getInputVector(): incompatible function arguments. The following argument types are supported:
    1. (self: fasttext_pybind.fasttext, arg0: fasttext_pybind.Vector, arg1: int) -> None

Invoked with: <fasttext_pybind.fasttext object at 0x2ab5724b0>, <fasttext_pybind.Vector object at 0x2a29c5370>, array(18446744073709551615, dtype=uint64)

This is with fasttext 0.9.1
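
One possible reading of the error, offered as a sketch rather than a confirmed fix: the signature above expects a single int for arg1, but the call passed a NumPy uint64 array, and 18446744073709551615 is what -1 becomes when assigned into an unsigned 64-bit array. The wrapper below casts to a signed dtype and calls the getter once per index with a plain Python int. The class name IndexedEmbeddings, its parameters, and the stand-in lookup table are all hypothetical; with a real model, get_vector would presumably be fastText_wv.get_input_vector, and vocab_size would have to be supplied by hand.

```python
import numpy as np


class IndexedEmbeddings:
    """Array-style lookup on top of a per-index vector getter (hypothetical helper)."""

    def __init__(self, get_vector, vocab_size, dim):
        self.get_vector = get_vector  # assumed to accept one Python int per call
        self.vocab_size = vocab_size  # passed in explicitly, since len(model) is unavailable
        self.dim = dim

    def __getitem__(self, item):
        # Cast to a signed dtype: in a uint64 array, -1 is stored as 18446744073709551615.
        idx = np.asarray(item).astype(np.int64)
        out = np.zeros(idx.shape + (self.dim,), dtype=np.float32)
        for pos in np.ndindex(idx.shape):
            i = int(idx[pos])  # plain Python int, one index per call
            if 0 <= i < self.vocab_size:
                out[pos] = self.get_vector(i)
            # out-of-range indices keep their zero vector
        return out


# Stand-in for a real model: a small table indexed by word id (assumption for the demo).
table = np.arange(12, dtype=np.float32).reshape(4, 3)
emb = IndexedEmbeddings(lambda i: table[i], vocab_size=4, dim=3)

ids = np.array([1, 3, 2**64 - 1], dtype=np.uint64)  # last entry is a wrapped -1
vecs = emb[ids]
```

Mapping out-of-range indices to a zero vector is a design choice made for this sketch; raising an error or returning a dedicated unknown-word vector would be equally reasonable.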

I'm sure you have this working with gensim or fasttext. I'm wondering if you could share your code example as you did with NumPy. I'm not sure where to start on debugging. Most of the methods in gensim require supplying the token rather than the word index.
