
Fine‑Tune Amazon Nova Canvas in 12 Hours for Consistent, Cohesive AI Storyboards (Part 2)

This guide shows how to fine‑tune the Amazon Nova Canvas foundation model on Amazon Bedrock in a 12‑hour workflow. The workflow extracts character frames from video with Amazon Rekognition, prepares labeled data, configures hyperparameters, creates a custom model, deploys it with provisioned throughput, and tests it by generating coherent storyboard images. It also covers cleanup steps to avoid ongoing costs.

Amazon Cloud Developers

Sep 12, 2025


Solution Overview

Fine‑tuning the Amazon Nova Canvas foundation model (FM) improves character consistency and style continuity across storyboard scenes.

Workflow Architecture

Upload source video to an Amazon S3 bucket.

Trigger an Amazon ECS task to down‑sample video frames, select frames containing the target character, and crop them to centered character images.

Use Amazon ECS together with Amazon Bedrock to invoke the Amazon Nova Pro model and generate descriptive text for each character image.

Write the generated text and metadata back to S3.

In an Amazon SageMaker Studio notebook, start a model‑customization job with the create_model_customization_job API, then request capacity for the resulting model with the create_provisioned_model_throughput API.

Deploy the fine‑tuned model with provisioned throughput for low‑latency inference.

Two‑Phase Process

Phase 1 – Data Preparation: Extract high‑resolution character images from video, label them, and de‑duplicate using Amazon Titan Multimodal Embeddings to avoid over‑fitting.

Phase 2 – Model Fine‑Tuning and Testing: Train the custom Nova Canvas model, evaluate it, and deploy the model for inference.

GitHub repository: https://github.com/aws-samples/sample-character-consistent-storyboard/tree/main/02-character-consistent-fine-tuning-with-amazon-nova-canvas

Extracting Creative Characters

Frames are sampled at a fixed interval (e.g., one frame per second). Amazon Rekognition is used for:

Label detection (over 2,000 labels) to locate generic character categories.

Face‑collection search to match specific characters across frames.

Optional: Amazon Rekognition Custom Labels can be trained to detect a bespoke character.
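As a rough sketch of the label‑detection step, the function below converts the relative bounding boxes in a Rekognition detect_labels response into pixel coordinates that a cropping step could use. The function name, the "Person" label, and the 80% confidence cutoff are illustrative assumptions, not values taken from the sample repository.

```python
def character_boxes(response, label_name, frame_width, frame_height, min_confidence=80.0):
    """Return (left, top, right, bottom) pixel boxes for one label.

    `response` is the dict returned by Rekognition detect_labels. Bounding
    boxes in that response are expressed as ratios of the frame size.
    """
    boxes = []
    for label in response.get("Labels", []):
        if label.get("Name") != label_name:
            continue
        for instance in label.get("Instances", []):
            if instance.get("Confidence", 0.0) < min_confidence:
                continue  # skip low-confidence detections
            box = instance["BoundingBox"]
            left = int(box["Left"] * frame_width)
            top = int(box["Top"] * frame_height)
            right = int((box["Left"] + box["Width"]) * frame_width)
            bottom = int((box["Top"] + box["Height"]) * frame_height)
            # Clamp to the frame so a crop never reads outside the image.
            boxes.append((max(left, 0), max(top, 0),
                          min(right, frame_width), min(bottom, frame_height)))
    return boxes


# Assumed usage with a real client (not executed here):
#   import boto3
#   rekognition = boto3.client("rekognition")
#   response = rekognition.detect_labels(Image={"Bytes": frame_bytes}, MinConfidence=80)
#   boxes = character_boxes(response, "Person", frame_width, frame_height)
```

Keeping the parsing separate from the API call makes it possible to check the crop logic against a saved response without touching AWS.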

After detection, each character image is centered, padded, and de‑duplicated with Amazon Titan Multimodal Embeddings based on a user‑adjustable similarity threshold.
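The threshold‑based de‑duplication can be sketched as a greedy filter over embedding vectors: an image is kept only if its cosine similarity to every image already kept stays below the threshold. This is a minimal illustration that assumes the embeddings were already fetched from Amazon Titan Multimodal Embeddings; the function names and the 0.9 default are hypothetical, and the sample repository may use a different strategy.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def deduplicate(embeddings, threshold=0.9):
    """Return the indices of images to keep.

    An image is dropped when it is at least `threshold` similar to an
    image that has already been kept.
    """
    kept = []
    for i, emb in enumerate(embeddings):
        if all(cosine_similarity(emb, embeddings[j]) < threshold for j in kept):
            kept.append(i)
    return kept
```

A lower threshold removes more near‑duplicates at the cost of a smaller training set, which is why it is worth exposing as a tunable value.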

Data Annotation

For each image, Amazon Nova Pro generates a description using a system prompt that emphasizes the three main characters and avoids repetitive phrasing. The annotation output is a JSONL file where each line pairs an S3 image reference with the generated alt_text:

{"image_ref": "s3://media-ip-dataset/characters/blue_character_01.jpg", "alt_text": "This animated character features a round face with large expressive eyes. The character has a distinctive blue color scheme with a small tuft of hair on top. The design is stylized with clean lines and a minimalist approach typical of modern animation."}
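A small sketch of how such a JSONL file could be assembled and sanity‑checked before upload follows. The helper names are hypothetical, and the field names simply mirror the example above; it is worth confirming against the Amazon Bedrock data‑preparation documentation which keys the training job expects.

```python
import json


def make_record(image_ref, alt_text):
    """Build one annotation record, rejecting obviously malformed input."""
    if not image_ref.startswith("s3://"):
        raise ValueError(f"image_ref must be an S3 URI, got: {image_ref!r}")
    text = " ".join(alt_text.split())  # collapse newlines and repeated spaces
    if not text:
        raise ValueError("alt_text must not be empty")
    return {"image_ref": image_ref, "alt_text": text}


def to_jsonl(records):
    """Serialize records as JSON Lines: one JSON object per line."""
    return "".join(json.dumps(record, ensure_ascii=False) + "\n" for record in records)


records = [
    make_record(
        "s3://media-ip-dataset/characters/blue_character_01.jpg",
        "This animated character features a round face\nwith large expressive eyes.",
    )
]
jsonl_text = to_jsonl(records)
```

Collapsing whitespace matters here because a stray newline inside a description would split one record across two lines and break the format.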

Human‑In‑The‑Loop Validation (Optional)

Enterprise use cases can insert an Amazon Augmented AI (A2I) step to let humans verify the image‑text pairs before training.

Fine‑Tuning the Model

With the annotated dataset uploaded to S3, a fine‑tuning job is created in the Amazon Bedrock console or via the Python SDK. Example hyper‑parameters that yielded good results are:

hyperParameters = {"stepCount": "14000", "batchSize": "64", "learningRate": "0.000001"}

Increasing the learning rate speeds up training but may degrade image quality; the recommended approach is to start with the default batch size and learning rate, then adjust the number of steps or batch size based on validation performance.
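To reason about how step count and batch size interact, it can help to estimate how many images the job processes in total. The sketch below assumes each training step consumes exactly one batch, which is the usual convention but is not stated in the source; the dataset size passed in is a placeholder, not a figure from the article.

```python
hyper_parameters = {"stepCount": "14000", "batchSize": "64", "learningRate": "0.000001"}


def images_processed(params):
    """Total images seen during training, assuming one batch per step."""
    return int(params["stepCount"]) * int(params["batchSize"])


def passes_over_dataset(params, dataset_size):
    """Approximate number of times the whole dataset is revisited."""
    if dataset_size <= 0:
        raise ValueError("dataset_size must be positive")
    return images_processed(params) / dataset_size


print(images_processed(hyper_parameters))  # 14000 * 64 = 896000
```

With the values above, halving the batch size while keeping the step count fixed also halves the number of images processed, so the two settings are best adjusted together.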

Creating the Fine‑Tuned Model via SDK

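import boto3

# roleArn, training_path, bucket, and prefix must be defined beforehand:
# the IAM role Bedrock assumes, the S3 URI of the JSONL training file,
# and the S3 location where the job writes its output.
bedrock = boto3.client('bedrock')
jobName = "picchu-canvas-v0"
response_ft = bedrock.create_model_customization_job(
    jobName=jobName,
    customModelName=jobName,  # the custom model reuses the job name
    roleArn=roleArn,
    baseModelIdentifier="amazon.nova-canvas-v1:0",
    hyperParameters=hyperParameters,
    trainingDataConfig={"s3Uri": training_path},
    outputDataConfig={"s3Uri": f"s3://{bucket}/{prefix}"}
)
jobArn = response_ft.get('jobArn')
print(jobArn)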
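Because the job runs for hours, a notebook typically polls its status before moving on. The helper below is a minimal sketch using get_model_customization_job; the function name, the polling interval, and the injectable sleep argument are illustrative choices rather than part of the original walkthrough.

```python
import time


def wait_for_customization_job(bedrock_client, job_arn, poll_seconds=60, sleep=time.sleep):
    """Poll a model customization job until it reaches a terminal status.

    Returns the final status string: "Completed", "Failed", or "Stopped".
    """
    terminal_statuses = {"Completed", "Failed", "Stopped"}
    while True:
        job = bedrock_client.get_model_customization_job(jobIdentifier=job_arn)
        status = job["status"]
        if status in terminal_statuses:
            return status
        sleep(poll_seconds)  # still "InProgress" or "Stopping"


# Assumed usage with the client and ARN from the snippet above (not executed here):
#   final_status = wait_for_customization_job(bedrock, jobArn)
#   if final_status != "Completed":
#       raise RuntimeError(f"Fine-tuning ended with status {final_status}")
```

Passing the client and the sleep function as arguments keeps the helper testable without an AWS account.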

After the job finishes, retrieve the custom model ARN:

custom_model_arn = bedrock.list_model_customization_jobs(
    nameContains=jobName
)["modelCustomizationJobSummaries"][0]["customModelArn"]

Deploying the Model with Provisioned Throughput

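# The customization job registered the custom model under the job name.
custom_model_name = jobName
provisioned_model_id = bedrock.create_provisioned_model_throughput(
    modelUnits=1,
    provisionedModelName=custom_model_name,
    modelId=custom_model_arn
)['provisionedModelArn']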

Testing the Fine‑Tuned Model

Once provisioned throughput is active, the model can generate storyboard images. The following Python function calls the model and decodes base64‑encoded images:

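import json
import boto3

bedrock_runtime = boto3.client('bedrock-runtime')

# decode_base64_image is a helper that turns one base64 string into an image;
# it must be defined before generate_image is called.
def generate_image(prompt, negative_prompt="text, ugly, blurry, distorted, low quality, pixelated, watermark, deformed", num_of_images=3, seed=1):
    # Generation settings: 1024x1024 premium-quality images, fixed seed for repeatability
    image_gen_config = {
        "numberOfImages": num_of_images,
        "quality": "premium",
        "width": 1024,
        "height": 1024,
        "cfgScale": 8.0,
        "seed": seed,
    }
    request_body = {
        "taskType": "TEXT_IMAGE",
        "textToImageParams": {"text": prompt, "negativeText": negative_prompt},
        "imageGenerationConfig": image_gen_config,
    }
    # provisioned_model_id holds the provisioned model ARN returned at deployment
    response = bedrock_runtime.invoke_model(modelId=provisioned_model_id, body=json.dumps(request_body))
    response_body = json.loads(response['body'].read())
    # Return an empty list when the response carries no images
    return [decode_base64_image(img) for img in response_body.get("images", [])]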
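The decode_base64_image helper is not shown in the article. One possible stand‑in, using only the standard library, decodes each string to raw bytes and checks for the PNG signature; this is an assumption about what the helper does, and the original notebook may instead return an image object from an imaging library.

```python
import base64

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"  # first eight bytes of every PNG file


def decode_base64_image(b64_string):
    """Decode one base64-encoded image from the model response into raw bytes."""
    return base64.b64decode(b64_string)


def is_png(image_bytes):
    """Return True when the bytes start with the PNG file signature."""
    return image_bytes[:8] == PNG_SIGNATURE
```

The returned bytes can be written straight to a file opened in binary mode, or wrapped in io.BytesIO and handed to an imaging library for display in the notebook.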

Sample generated frames show the character Mayu with consistent facial expression and style across different scenes.

Cleanup

To avoid ongoing charges, delete the SageMaker Studio domain and the fine‑tuned model together with its provisioned‑throughput endpoint.
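The Bedrock side of the cleanup can be sketched with two SDK calls. The function name is hypothetical, and the ordering (provisioned throughput first, then the custom model) is chosen as the safer sequence since the model stays in use until its throughput is released. Deleting the SageMaker Studio domain is a separate step that this sketch does not cover.

```python
def cleanup_bedrock_resources(bedrock_client, provisioned_model_arn, custom_model_arn):
    """Release provisioned throughput, then delete the fine-tuned model."""
    # Provisioned throughput is billed while it exists, so remove it first.
    bedrock_client.delete_provisioned_model_throughput(
        provisionedModelId=provisioned_model_arn
    )
    # With no throughput attached, the custom model itself can be deleted.
    bedrock_client.delete_custom_model(modelIdentifier=custom_model_arn)


# Assumed usage with names from the earlier snippets (not executed here):
#   cleanup_bedrock_resources(bedrock, provisioned_model_id, custom_model_arn)
```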

Conclusion

Fine‑tuning Amazon Nova Canvas on Amazon Bedrock dramatically improves character and style consistency in AI‑generated storyboards while reducing creation time from weeks to a few hours. The end‑to‑end pipeline integrates video processing, Amazon Rekognition for character extraction, optional custom labeling, de‑duplication with Titan embeddings, hyper‑parameter tuning, and deployment with provisioned throughput.


