Manage Datasets

Datasets are the secret sauce of generative AI. The data you train on, and how it is incorporated into your prompt, are what ultimately make your fine-tuned model unique. GenAI Studio supports CSV datasets and HuggingFace datasets.

All uploaded datasets are scoped to the project level.

Linked Data Templating

Once you have uploaded a dataset, you can link it into your prompt input using a templating syntax that maps to the column names in your dataset. For example, if your dataset has the columns title, text, and label, you can link it to your prompt as follows:

Title: {{title}}
Text: {{text}}
Label: {{label}}

This templating syntax supports auto-completion and is case-sensitive, so each placeholder must match a column name in your dataset exactly.
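To make the column mapping concrete, here is a minimal Python sketch of how double-brace placeholders could be filled from one row of a CSV dataset. It is an illustration only, not GenAI Studio's actual implementation: the `render_prompt` function, the `PLACEHOLDER` pattern, and the sample row are all hypothetical. It assumes a placeholder that names a missing column should be treated as an error, which is one way to surface the case-sensitivity rule.

```python
import csv
import io
import re

# Hypothetical pattern for {{column_name}} placeholders.
# Whitespace just inside the braces is tolerated in this sketch.
PLACEHOLDER = re.compile(r"\{\{\s*([^{}]+?)\s*\}\}")


def render_prompt(template: str, row: dict) -> str:
    """Replace each {{column}} placeholder with the matching value from row.

    The lookup is case-sensitive: {{Title}} does not match a column
    named "title" and raises KeyError.
    """
    def substitute(match: re.Match) -> str:
        column = match.group(1)
        if column not in row:
            raise KeyError(f"No column named {column!r}; available: {sorted(row)}")
        return str(row[column])

    return PLACEHOLDER.sub(substitute, template)


# Made-up sample data with the columns used in the example above.
SAMPLE_CSV = "title,text,label\nGreat phone,Battery lasts two days.,positive\n"
TEMPLATE = "Title: {{title}}\nText: {{text}}\nLabel: {{label}}"

rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
rendered = render_prompt(TEMPLATE, rows[0])
print(rendered)
```

In this sketch, a template containing `{{Title}}` would raise a `KeyError` because the column is named `title`, which mirrors why placeholders need to match column names exactly.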

Model Creation Journey

Upload datasets to your project at the beginning of your journey so that they are available for comparing base models, creating snapshots, fine-tuning models, and testing an output model.

graph LR;
    A(Create Project) --> B(Import Data);
    B --> C(Snapshot Model);
    C --> D(Fine-Tune Model);
    D --> E(Export Model);
    style A fill:#fff,stroke:#333,stroke-width:2px;
    style B fill:#7FF9E2,stroke:#333,stroke-width:4px;
    style C fill:#fff,stroke:#333,stroke-width:2px;
    style D fill:#fff,stroke:#333,stroke-width:2px;
    style E fill:#fff,stroke:#333,stroke-width:2px;