gpu-mode/kernelbot-data

Hugging Face Dataset Exporter

This script exports data from a Postgres database to a Hugging Face dataset in Parquet format.

Setup

  1. Install dependencies:

    Navigate to this directory and install the required Python packages.

    pip install -r requirements.txt
  2. Set environment variables:

    The script uses environment variables for database credentials. You can set them in your shell or use a .env file.

    export DB_USER="your_db_user"
    export DB_PASSWORD="your_db_password"
    export DB_HOST="localhost"
    export DB_PORT="5432"
    export DB_NAME="your_db_name"
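
As a rough illustration of how these variables might be consumed, the sketch below reads the five `DB_*` variables and assembles a Postgres connection URL. The helper name `build_db_url` and the fail-fast check for missing variables are assumptions made for this example; the actual `export.py` may read and use the credentials differently.

```python
import os
from urllib.parse import quote

# The five variables named in the setup step above.
REQUIRED_VARS = ("DB_USER", "DB_PASSWORD", "DB_HOST", "DB_PORT", "DB_NAME")


def build_db_url(env=None):
    """Build a Postgres connection URL from the DB_* variables.

    `build_db_url` is a hypothetical helper, not part of export.py.
    Pass a dict as `env` to override os.environ (useful for testing).
    """
    env = os.environ if env is None else env

    # Fail early with a clear message if anything is unset or empty.
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))

    # Percent-encode the credentials so characters such as '@' or ':'
    # in a password do not break the URL.
    user = quote(env["DB_USER"], safe="")
    password = quote(env["DB_PASSWORD"], safe="")
    return (
        f"postgresql://{user}:{password}"
        f"@{env['DB_HOST']}:{env['DB_PORT']}/{env['DB_NAME']}"
    )
```

Checking for missing variables before connecting tends to give a clearer error than a failed connection attempt would.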

Usage

Run the script from the root of the repository:

python export.py

The script creates a directory at the output path and writes the dataset there in Parquet format. If --output_dir is not provided, it saves to a directory named dataset in the current working directory.
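
The default described above could be expressed with `argparse` roughly as follows. This is a minimal sketch: only the `--output_dir` flag and its `dataset` default come from this README, while the function name `parse_args` and the help text are assumptions, and the real `export.py` may define its arguments differently.

```python
import argparse


def parse_args(argv=None):
    """Parse command-line options (hypothetical helper, not from export.py)."""
    parser = argparse.ArgumentParser(
        description="Export data from Postgres to a Parquet dataset."
    )
    # Falls back to a "dataset" directory, relative to the current
    # working directory, when the flag is omitted.
    parser.add_argument(
        "--output_dir",
        default="dataset",
        help="Directory to write the Parquet dataset to.",
    )
    return parser.parse_args(argv)
```

With a parser like this, `python export.py --output_dir ./my_export` would write to `./my_export`, and a bare `python export.py` would write to `./dataset`.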
