Wrong ADP POS tag assignment in the German model #1184

@Goldritter

Hi,

I use both stanford-corenlp and the German model in version "4.2.2" and found some odd behavior in the assignment of POS tags for words like "für" and "vor". These two words should normally be tagged as "ADP", but I often get "NOUN" or even "PROPN" as the POS tag for them.

For example, in the sentence "Welcher der Befunde ist für eine Gehirnerkrankung typisch?" the word "für" is tagged as "PROPN", and in "Welcher der Befunde ist am ehesten für hier am wahrscheinlichsten vorliegende Gehirnerkrankung typisch?" it is tagged as "NOUN".

However, in the sentences "Für wen ist das Essen?" and "Ich war reif für das Bett." the word "für" is correctly tagged as "ADP".

Here are the properties I used for the initialization:
(.setProperty properties "annotators" "tokenize, ssplit, mwt, pos, lemma, ner, parse, depparse")
(.setProperty properties "coref.algorithm" "neural")
(.setProperty properties "depparse.language" "german")
(.setProperty properties "ner.language" "de")
(.setProperty properties "tokenize.options" "untokenizable=noneDelete")
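
For anyone who wants to reproduce this, here is a minimal sketch of how the tags could be printed per token. It assumes that `properties` is the `java.util.Properties` object configured above and that the CoreNLP jar and the German models jar are on the classpath. `pos-tags` and `report-tags` are hypothetical helper names used only for illustration; they are not part of CoreNLP.

```clojure
(import '(java.util Properties)
        '(edu.stanford.nlp.pipeline StanfordCoreNLP CoreDocument))

;; Hypothetical helper: annotate `text` and return [word tag] pairs,
;; one per token, using the POS tag set by the "pos" annotator.
(defn pos-tags [^StanfordCoreNLP pipeline ^String text]
  (let [doc (CoreDocument. text)]
    (.annotate pipeline doc)
    (mapv (fn [token] [(.word token) (.tag token)])
          (.tokens doc))))

;; Hypothetical helper: build one pipeline from the given properties
;; and print the [word tag] pairs for each sentence.
(defn report-tags [^Properties properties sentences]
  (let [pipeline (StanfordCoreNLP. properties)]
    (doseq [sentence sentences]
      (println sentence)
      (println (pos-tags pipeline sentence)))))

;; Usage, assuming `properties` was set up as shown above:
(comment
  (report-tags properties
               ["Welcher der Befunde ist für eine Gehirnerkrankung typisch?"
                "Für wen ist das Essen?"
                "Ich war reif für das Bett."]))
```

Comparing the pair printed for "für" across these sentences should show whether the tag changes with context in the same way as described above.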

This is all the information I have. I hope it is helpful.

Best regards
Goldritter
