In this project, I develop an AI-powered chatbot using BERT, a leading model for natural language understanding. I start by loading and analyzing conversational data, then split it into training, validation, and test sets. I use a pre-trained BERT tokenizer to convert the text into numerical inputs and fine-tune a pre-trained BERT model on my dataset. Finally, I evaluate the chatbot's performance and save the trained model for future use.
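A minimal sketch of the loading and splitting step is shown below. The file name `conversations.csv` and the `text`/`intent` column names are illustrative assumptions, not the project's actual schema.

```python
# Load the conversational data and split it 80/10/10 into train/val/test.
# "conversations.csv", "text", and "intent" are assumed names for illustration.
import pandas as pd
from sklearn.model_selection import train_test_split

df = pd.read_csv("conversations.csv")  # columns assumed: text, intent

# Carve out 20% as a temporary hold-out, then split it in half for val/test.
train_df, temp_df = train_test_split(
    df, test_size=0.2, random_state=42, stratify=df["intent"])
val_df, test_df = train_test_split(
    temp_df, test_size=0.5, random_state=42, stratify=temp_df["intent"])

print(len(train_df), len(val_df), len(test_df))
```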
To run the chatbot, follow these steps:
- Open the notebook via the "Open in Colab" link and run all cells.
- Interact with the chatbot by typing your queries and receiving responses.
Before diving into data preprocessing, an essential step is to analyze the dataset to gain insights and understand the distribution of the data. This involves:
- Exploratory Data Analysis (EDA): Conducting a thorough EDA to understand the data's structure, common patterns, and potential outliers.
- Visualization: Using visual tools like histograms, bar charts, and word clouds to represent the distribution of different classes, most frequent words, and conversation lengths.
- Statistical Summary: Generating summary statistics to evaluate the mean, median, mode, and standard deviation of various features in the dataset.
By performing these analyses, we can identify key characteristics of the conversational data, which will inform the subsequent steps in data preprocessing and model training.
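The sketch below illustrates a minimal version of this EDA, assuming the DataFrame and column names from the loading sketch above; the plotting choices (bar chart of classes, histogram of lengths) are illustrative.

```python
# Minimal EDA sketch: class distribution, conversation lengths, summary stats.
import matplotlib.pyplot as plt

# Class distribution: number of examples per intent.
df["intent"].value_counts().plot(kind="bar", title="Class distribution")
plt.tight_layout()
plt.show()

# Conversation length distribution (whitespace tokens per example).
lengths = df["text"].str.split().str.len()
lengths.plot(kind="hist", bins=30, title="Conversation length (tokens)")
plt.show()

# Summary statistics for the length feature: count, mean, std, quartiles.
print(lengths.describe())
```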
The dataset used for fine-tuning the pre-trained model underwent preprocessing to ensure quality and relevance. Steps involved in data preprocessing include:
- Data cleaning
- Tokenization
- Lemmatization
- Removing stopwords
- Data augmentation (if applicable)
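A possible implementation of these steps is sketched below using NLTK; the library choice and column names are assumptions, data augmentation is omitted, and BERT's own WordPiece tokenizer is applied later, at encoding time.

```python
# Preprocessing sketch: cleaning, simple tokenization, stopword removal,
# and lemmatization. Assumes the train/val/test DataFrames from above.
import re
import nltk
from nltk.corpus import stopwords
from nltk.stem import WordNetLemmatizer

nltk.download("stopwords")
nltk.download("wordnet")

stop_words = set(stopwords.words("english"))
lemmatizer = WordNetLemmatizer()

def preprocess(text: str) -> str:
    text = re.sub(r"[^a-z0-9\s]", " ", text.lower())      # cleaning
    tokens = text.split()                                  # whitespace tokenization
    tokens = [t for t in tokens if t not in stop_words]    # stopword removal
    tokens = [lemmatizer.lemmatize(t) for t in tokens]     # lemmatization
    return " ".join(tokens)

for split in (train_df, val_df, test_df):
    split["clean_text"] = split["text"].apply(preprocess)
```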
The chatbot is built on a pre-trained BERT model that is fine-tuned on the provided dataset. Model implementation includes:
- Loading the pre-trained model
- Fine-tuning the model with the dataset
- Saving the fine-tuned model for deployment
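The sketch below shows one way to wire these steps together with Hugging Face Transformers, assuming the chatbot is framed as intent classification over the DataFrames from the earlier sketches; the model name, hyperparameters, and output directory are illustrative choices, not the project's exact configuration.

```python
# Fine-tuning sketch: load a pre-trained BERT, train on the labeled data,
# and save the resulting model and tokenizer for deployment.
import torch
from transformers import (BertForSequenceClassification, BertTokenizerFast,
                          Trainer, TrainingArguments)

# Map intent names to integer ids (assumes the DataFrames from the sketches above).
label_names = sorted(train_df["intent"].unique())
label_to_id = {name: i for i, name in enumerate(label_names)}

tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")
model = BertForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=len(label_names))

class IntentDataset(torch.utils.data.Dataset):
    """Tokenizes texts and pairs them with integer labels for the Trainer."""
    def __init__(self, frame):
        self.enc = tokenizer(frame["clean_text"].tolist(), truncation=True,
                             padding=True, max_length=128)
        self.labels = [label_to_id[x] for x in frame["intent"]]

    def __len__(self):
        return len(self.labels)

    def __getitem__(self, idx):
        item = {k: torch.tensor(v[idx]) for k, v in self.enc.items()}
        item["labels"] = torch.tensor(self.labels[idx])
        return item

args = TrainingArguments(output_dir="chatbot-bert", num_train_epochs=3,
                         per_device_train_batch_size=16, logging_steps=50)
trainer = Trainer(model=model, args=args,
                  train_dataset=IntentDataset(train_df),
                  eval_dataset=IntentDataset(val_df))
trainer.train()

# Save the fine-tuned model and tokenizer for later deployment.
trainer.save_model("chatbot-bert")
tokenizer.save_pretrained("chatbot-bert")
```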
The chatbot's performance was evaluated using various metrics, including:
- Accuracy
- Precision
- Recall
- F1 Score
- Confusion Matrix
The evaluation process helps assess the chatbot's effectiveness in handling user queries and providing accurate responses.
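The metrics above can be computed with scikit-learn as sketched below; the `trainer`, dataset class, and test split come from the earlier sketches and are assumptions about how the pipeline is wired together.

```python
# Evaluation sketch: accuracy, precision, recall, F1, and confusion matrix
# on the held-out test set.
import numpy as np
from sklearn.metrics import (accuracy_score, confusion_matrix,
                             precision_recall_fscore_support)

test_dataset = IntentDataset(test_df)          # from the sketches above
pred_output = trainer.predict(test_dataset)
y_pred = np.argmax(pred_output.predictions, axis=-1)
y_true = pred_output.label_ids

accuracy = accuracy_score(y_true, y_pred)
precision, recall, f1, _ = precision_recall_fscore_support(
    y_true, y_pred, average="weighted", zero_division=0)

print(f"Accuracy:  {accuracy:.3f}")
print(f"Precision: {precision:.3f}")
print(f"Recall:    {recall:.3f}")
print(f"F1 score:  {f1:.3f}")
print(confusion_matrix(y_true, y_pred))
```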
