How HiVideo Works: From Upload to Final Video in 15 Minutes

HiVideo turns two simple inputs—a motion video and a character image—into a finished AI-generated video in about 15 minutes. No video editing skills required.

This guide walks through exactly how the process works, what happens behind the scenes, and how to get the best possible results from your generations.

The Two Inputs You Need

Every HiVideo generation starts with two things: a motion reference and a character.

Motion Reference Video

This is a video showing the movement you want your character to perform. It could be you recording yourself, a stock video clip, or any footage with clear human movement.

What makes a good reference:

Clear, well-lit footage
Subject visible from at least the shoulders up
Stable camera (tripod or steady hands)
Simple background (not required, but helps)
10 seconds or less (longer videos can be trimmed)

Supported formats: MP4, MOV (max 100MB)
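
The limits above can be checked before you upload. The following is a minimal sketch, not part of HiVideo: the function name and return shape are made up for illustration, and it assumes "100MB" means 100 × 1024 × 1024 bytes, which the guide does not specify.

```python
from pathlib import Path

# Limits taken from this guide; everything else here is illustrative.
MAX_BYTES = 100 * 1024 * 1024  # assumption: "100MB" read as 100 MiB
ALLOWED_EXTENSIONS = {".mp4", ".mov"}
MAX_SECONDS = 10.0


def check_reference(filename: str, size_bytes: int, duration_s: float) -> list[str]:
    """Return a list of problems; an empty list means none were found."""
    problems = []
    if Path(filename).suffix.lower() not in ALLOWED_EXTENSIONS:
        problems.append("unsupported format: use MP4 or MOV")
    if size_bytes > MAX_BYTES:
        problems.append("file is larger than 100MB")
    if duration_s > MAX_SECONDS:
        problems.append("longer than 10 seconds: will need trimming")
    return problems


print(check_reference("dance.MOV", 42_000_000, 8.5))
print(check_reference("dance.avi", 150_000_000, 12.0))
```

The first call reports no problems; the second reports all three.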

Character Image

This is who will perform the motion. You have three options:

AI Characters (Built-in): HiVideo includes ready-to-use AI-generated characters. Professional-looking, diverse options, no upload needed.
Upload Your Own: Use any image—a photo, illustration, AI-generated art, or digital avatar. PNG, JPG, or WebP formats.
Face Swap: Start with your reference video's person, but swap their face with a different character. Useful when you like a reference video but want a different presenter.

The Generation Process (Step by Step)

Here's exactly what you do in HiVideo:

Step 1: Upload Your Motion Video

Drag and drop or click to upload. If your video is longer than 10 seconds, you'll be prompted to trim it to select the best 10-second segment.
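
To make the trimming step concrete, here is a small sketch of how a 10-second window could be chosen inside a longer video. It is an assumption about the logic, not HiVideo's actual trimmer; the function name and parameters are hypothetical.

```python
def trim_window(duration_s: float, start_s: float, max_len_s: float = 10.0) -> tuple[float, float]:
    """Clamp a requested start time so the selected clip stays inside the video."""
    length = min(max_len_s, duration_s)  # short videos are kept whole
    start = max(0.0, min(start_s, duration_s - length))
    return start, start + length


print(trim_window(25.0, 3.0))   # a window starting where requested
print(trim_window(25.0, 20.0))  # start pulled back so the clip fits
print(trim_window(6.0, 2.0))    # video already under 10 seconds
```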

Step 2: Choose Your Character

Pick from AI characters, upload your own image, or use face swap. The character you select is the one that will appear in your output.

Step 3: Add a Scene Description (Optional)

A text prompt that describes the scene—not the motion (that comes from your video). Examples:

"A professional studio with soft lighting"
"Outdoor setting with natural sunlight"
"Minimalist white background"

Step 4: Generate

Click the generate button. Your video enters the processing queue.

Step 5: Wait (~15 minutes)

AI generation takes time. You'll receive an email when your video is ready, so you don't need to keep the page open.

Step 6: Configure Audio

Once your video is generated, choose whether to keep the original audio from your reference video, add a new voice, or keep it silent.

Step 7: Download

Your finished video is ready to download. Use it directly or bring it into your video editor for further work.

What Happens Behind the Scenes

When you click generate, here's what our AI systems do:

Pose Estimation

First, we analyze your reference video frame by frame. The AI identifies body keypoints (head, shoulders, elbows, wrists, hips) and tracks their positions throughout the video. This creates a motion "skeleton."
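
One way to picture the motion "skeleton" is as a set of named keypoints per frame plus a list of bones that connect them. The sketch below is illustrative only: the keypoint names, the bone list, and the coordinates are assumptions, not HiVideo's internal format.

```python
import math

# Hypothetical bone list built from the joints named in this guide.
BONES = [
    ("left_shoulder", "right_shoulder"),
    ("left_shoulder", "left_elbow"),
    ("left_elbow", "left_wrist"),
]

# One frame of keypoints as (x, y) pixel coordinates; the values are made up.
frame = {
    "left_shoulder": (100.0, 200.0),
    "right_shoulder": (220.0, 200.0),
    "left_elbow": (100.0, 290.0),
    "left_wrist": (140.0, 320.0),
}


def bone_lengths(keypoints):
    """Measure each bone as the distance between its two keypoints."""
    return {f"{a}-{b}": math.dist(keypoints[a], keypoints[b]) for a, b in BONES}


for bone, length in bone_lengths(frame).items():
    print(bone, length)
```

A full video would then be a list of such frames, one per video frame.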

Facial Landmark Tracking

Separately, we track 68+ facial landmarks: eyes, eyebrows, nose, mouth, jawline. This captures the subtle expressions that make video feel human.
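
As a rough illustration, the sketch below groups landmarks by facial region using the index ranges of the widely used 68-point annotation layout. That layout is an assumption: this guide does not say which landmark scheme HiVideo's tracker uses, and the helper name is hypothetical.

```python
# Index ranges (0-based) of the common 68-point facial landmark layout.
# Assumption: HiVideo's own tracker may use a different scheme.
REGIONS = {
    "jawline": range(0, 17),
    "eyebrows": range(17, 27),
    "nose": range(27, 36),
    "eyes": range(36, 48),
    "mouth": range(48, 68),
}


def region_points(landmarks, region):
    """Pick the (x, y) points that belong to one facial region."""
    return [landmarks[i] for i in REGIONS[region]]


# Dummy landmarks so the example runs without a face tracker.
landmarks = [(float(i), float(i)) for i in range(68)]
print(len(region_points(landmarks, "mouth")))
print(sum(len(r) for r in REGIONS.values()))
```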

Character Analysis

The AI studies your character image, understanding its proportions, style, lighting, and structure. This ensures the output matches your character's appearance.

Motion Mapping

The extracted motion data is translated to your character's proportions. A tall character and a short character move differently even when performing the same action—the AI accounts for this.
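
A deliberately simplified way to think about this step is uniform scaling around an anchor point. The sketch below is not HiVideo's method: it assumes a single scale factor derived from shoulder width and a hypothetical "hips" anchor, whereas a real system would likely adjust each limb separately.

```python
def retarget(keypoints, source_shoulder_width, target_shoulder_width, anchor="hips"):
    """Scale every keypoint toward or away from the anchor by one factor."""
    scale = target_shoulder_width / source_shoulder_width
    ax, ay = keypoints[anchor]
    return {
        name: (ax + (x - ax) * scale, ay + (y - ay) * scale)
        for name, (x, y) in keypoints.items()
    }


# Made-up pose from a reference performer.
pose = {"hips": (100.0, 300.0), "head": (100.0, 100.0), "wrist": (160.0, 220.0)}

# Map it onto a character whose shoulders are half as wide.
print(retarget(pose, source_shoulder_width=120.0, target_shoulder_width=60.0))
```

The anchor stays in place while every other point moves proportionally, so the pose is preserved at the character's size.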

Frame-by-Frame Generation

Each video frame is generated individually. The AI creates your character in each pose while maintaining visual consistency.

Temporal Coherence

Raw frame-by-frame generation can look jittery. A smoothing pass ensures natural movement transitions and consistent appearance across frames.
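
To show what a smoothing pass does in principle, here is an exponential moving average applied to one jittery value over time, such as a keypoint's x coordinate. This is a generic technique chosen for illustration; the guide does not describe which smoothing method HiVideo uses, and the function name and alpha value are assumptions.

```python
def smooth(values, alpha=0.5):
    """Exponential moving average: blend each value with the running result."""
    smoothed = []
    prev = None
    for v in values:
        prev = v if prev is None else alpha * v + (1 - alpha) * prev
        smoothed.append(prev)
    return smoothed


jittery = [0, 10, 0, 10]
print(smooth(jittery))  # [0, 5.0, 2.5, 6.25]
```

A lower alpha smooths more strongly but makes the output lag further behind fast movements.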

Audio Processing

If you keep the original audio, it is synchronized with the generated video. Because the generated frames keep the original frame timing, the audio stays in sync.
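
The arithmetic behind this is simple when the frame rate is constant: each frame maps to a fixed position in the audio. The sketch below assumes a constant frame rate and a known audio sample rate; the function name and the example rates are illustrative.

```python
def frame_to_sample(frame_index: int, fps: float, sample_rate: int) -> int:
    """Audio sample position at which a given video frame starts."""
    return round(frame_index * sample_rate / fps)


# At 30 fps and 48 kHz audio, each frame spans 1600 samples.
print(frame_to_sample(1, fps=30, sample_rate=48_000))    # 1600
print(frame_to_sample(30, fps=30, sample_rate=48_000))   # 48000, i.e. one second in
```

If frame durations vary, this fixed mapping no longer holds, which is why variable frame rate footage can drift out of sync.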

Final Composition

Everything comes together into your downloadable MP4 file.

Tips for Best Results

Get more out of every generation with these practical tips:

Reference Video Tips

Film in good lighting (natural light or well-lit room)
Keep the camera steady (tripod recommended)
Face the camera directly for best face tracking
Wear solid colors (patterns can confuse the AI)
Move at a moderate pace (very fast motion may not track well)
Keep movements within frame (don't go off-screen)

Character Image Tips

Use high-resolution images (1024px+ on the short side)
Choose images with clear, visible faces
Front-facing or slight angles work best
Consistent lighting in the image helps
AI-generated characters often work better than photos
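
The measurable parts of these tips, format and resolution, can be checked ahead of time. This is a hypothetical helper, not a HiVideo feature; it takes the width and height as arguments so it needs no image library, and treating ".jpeg" as equivalent to JPG is an assumption.

```python
from pathlib import Path

IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg", ".webp"}  # ".jpeg" is assumed
MIN_SHORT_SIDE = 1024  # pixels, from the tip above


def check_character_image(filename: str, width: int, height: int) -> list[str]:
    """Return a list of problems; an empty list means none were found."""
    problems = []
    if Path(filename).suffix.lower() not in IMAGE_EXTENSIONS:
        problems.append("unsupported format: use PNG, JPG, or WebP")
    if min(width, height) < MIN_SHORT_SIDE:
        problems.append("short side is under 1024px")
    return problems


print(check_character_image("avatar.png", 1536, 1024))
print(check_character_image("avatar.gif", 800, 1200))
```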

Prompt Tips

Describe the environment, not the action
Keep it simple: "studio lighting" beats a paragraph
Don't contradict your reference (indoor motion + "outdoor beach" = confusion)

When to Use Face Swap

You have a great reference video but want a different face
You want a specific person's likeness on motion from stock footage
You're building a consistent character across many videos

Troubleshooting Common Issues

Something not right? Here's how to fix common problems:

Character looks distorted

Check your character image resolution (too small = poor results)
Try a different character image with clearer features
Ensure the character image has good lighting

Motion doesn't match reference

Your reference video may be too dark or blurry
Try a reference with the subject more centered
Avoid references with multiple people (the AI may not know which person to follow)

Output looks jittery

Reference video may have inconsistent lighting
Try a reference with smoother, slower movements
Check whether the reference has dropped frames or compression artifacts

Generation failed

File may be too large (100MB limit)
Format may not be supported (use MP4 or MOV)
Try re-uploading the file

Audio out of sync

This is rare but can happen with variable-frame-rate videos
Try converting your reference to a constant frame rate before upload
Or generate without audio and add it in post-production
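
One common way to convert a video to a constant frame rate is ffmpeg's fps filter. ffmpeg is a separate command-line tool, not part of HiVideo, and must be installed on your machine. The sketch below only builds and prints the command; the file names and the 30 fps target are assumptions.

```python
import shlex


def cfr_command(src: str, dst: str, fps: int = 30) -> list[str]:
    """Build an ffmpeg command that re-times the video to a constant frame rate."""
    return ["ffmpeg", "-i", src, "-vf", f"fps={fps}", dst]


cmd = cfr_command("reference.mov", "reference_cfr.mp4")
print(shlex.join(cmd))

# To actually run the conversion (requires ffmpeg on your PATH):
# import subprocess
# subprocess.run(cmd, check=True)
```

Pick a target frame rate close to the source's average rate to minimize duplicated or dropped frames.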

Conclusion

HiVideo's process is designed to be simple: upload two files, click a button, get a video. The AI handles the complex work of motion extraction, character mapping, and video generation.

The best way to learn is to try it. Your first generation will teach you more than any guide.

Ready to create your first AI video?

Upload a motion reference, choose a character, and see the results in about 15 minutes.