Implemented LLM performance test cases #76

VivekSinghDS · 2024-02-20T16:28:52Z

The tests cover different aspects of LLM performance, including:

1. Length: Measures the absolute difference in length between the ground truth and the model prediction.
2. Jaccard similarity: Calculates the Jaccard similarity score between the ground truth and model prediction sets.
3. Dot product similarity: Measures the similarity between the ground truth and model prediction embeddings using the dot product.
4. ROUGE score: Computes the ROUGE-1 score between the ground truth and the model prediction.
5. Word overlap: Calculates the percentage of overlapping words between the ground truth and the model prediction after removing stop words.
6. Part-of-speech composition: Analyzes the percentage of verbs, adjectives, and nouns in the model prediction.
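
For reviewers who want a feel for what these checks compute, here is a minimal standard-library sketch of the first five metrics. It is not the code from this PR: the function names, the regex tokenizer, the tiny stop-word list, measuring length in characters, reporting ROUGE-1 as an F1 score, and using the ground truth as the denominator for word overlap are all assumptions made for illustration. Part-of-speech composition is left out because it needs a tagger from an external library.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list; a real test would likely use a library list.
STOP_WORDS = frozenset({"a", "an", "the", "on", "in", "of", "to", "is", "and"})


def tokenize(text):
    """Lowercase the text and split it into word tokens (assumed tokenization)."""
    return re.findall(r"\w+", text.lower())


def length_difference(ground_truth, prediction):
    """Absolute difference in character length (assumption: characters, not words)."""
    return abs(len(ground_truth) - len(prediction))


def jaccard_similarity(ground_truth, prediction):
    """|A ∩ B| / |A ∪ B| over the sets of word tokens."""
    a, b = set(tokenize(ground_truth)), set(tokenize(prediction))
    if not a and not b:
        return 1.0  # two empty texts are treated as identical
    return len(a & b) / len(a | b)


def dot_product_similarity(u, v):
    """Dot product of two embedding vectors of equal length."""
    if len(u) != len(v):
        raise ValueError("embeddings must have the same dimension")
    return sum(x * y for x, y in zip(u, v))


def rouge1_f1(ground_truth, prediction):
    """ROUGE-1 as the F1 of clipped unigram overlap (assumption: F1 variant)."""
    ref, hyp = Counter(tokenize(ground_truth)), Counter(tokenize(prediction))
    overlap = sum((ref & hyp).values())  # Counter & Counter keeps the minimum counts
    if overlap == 0:
        return 0.0
    precision = overlap / sum(hyp.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


def word_overlap_percent(ground_truth, prediction):
    """Percentage of ground-truth content words that also appear in the prediction."""
    ref = set(tokenize(ground_truth)) - STOP_WORDS
    hyp = set(tokenize(prediction)) - STOP_WORDS
    if not ref:
        return 0.0
    return 100.0 * len(ref & hyp) / len(ref)


if __name__ == "__main__":
    gt = "the cat sat on the mat"
    pred = "the cat lay on a mat"
    print(length_difference(gt, pred))
    print(jaccard_similarity(gt, pred))
    print(rouge1_f1(gt, pred))
    print(word_overlap_percent(gt, pred))
    print(dot_product_similarity([1, 2, 3], [4, 5, 6]))
```

Note that the raw dot product depends on the magnitude of the embeddings, so scores are only comparable across examples if the embeddings are normalized.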

RohitSaha merged commit 92eaa2d into georgian-io:feature-llm-qa Feb 21, 2024

VivekSinghDS deleted the patch-1 branch February 23, 2024 05:34

benjaminye added the enhancement (New feature or request) label Feb 27, 2024
