App Preview Video Formula: The 5-Frame Structure That Converts (2026)
App Store preview videos autoplay silently in Search results before a user ever taps your listing, and most waste that opportunity in the first five seconds. Apple allows up to three 30-second videos per language, and well-optimized previews are commonly reported to lift conversion 20–35% over screenshot-only listings. The structure that converts is not a screen-recording walkthrough of your app. It's a deliberately sequenced five-frame argument, with each segment doing a specific job. Here's what goes where, and why most indie developer previews fail the same way.
The 5-frame structure — how to allocate 30 seconds without wasting any of them
The five-frame structure divides a 30-second preview into distinct jobs: Hook (0–5s), first feature delivery (5–12s), second feature delivery (12–18s), credibility layer (18–24s), and landing CTA (24–30s). Every second maps to one of these five segments. The structure exists because attention in the App Store browse context is borrowed, not given — users are scrolling, your video is fighting scroll momentum, and each frame has to lock attention or pass the baton before they scroll away.
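As a planning aid, the allocation above can be written down as data and checked before you storyboard. This is a minimal sketch: the `Segment` class, the `FIVE_FRAME_PLAN` list, and the `validate_plan` helper are illustrative names invented here, not part of any Apple tooling. Only the timings come from the structure described above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    name: str
    start: int  # seconds from the start of the video
    end: int    # exclusive end, in seconds

    @property
    def duration(self) -> int:
        return self.end - self.start


# Timings taken from the five-frame structure described in this guide.
FIVE_FRAME_PLAN = [
    Segment("hook", 0, 5),
    Segment("feature_1", 5, 12),
    Segment("feature_2", 12, 18),
    Segment("credibility", 18, 24),
    Segment("landing_cta", 24, 30),
]


def validate_plan(plan, total_seconds=30):
    """Return a list of problems; an empty list means the plan tiles the video."""
    problems = []
    cursor = 0
    for seg in plan:
        # Each segment must start exactly where the previous one ended.
        if seg.start != cursor:
            problems.append(f"{seg.name}: expected start {cursor}s, got {seg.start}s")
        if seg.duration <= 0:
            problems.append(f"{seg.name}: non-positive duration")
        cursor = seg.end
    if cursor != total_seconds:
        problems.append(f"plan ends at {cursor}s, expected {total_seconds}s")
    return problems
```

Running `validate_plan(FIVE_FRAME_PLAN)` returns an empty list, which confirms that the five segments cover the full 30 seconds with no gaps or overlaps. If you shift a boundary while editing, the same check flags the segment that no longer lines up.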
The most common structural error is treating the full 30 seconds as a demonstration. A demonstration shows what the app does. The five-frame structure makes an argument for why someone should install now. Most apps answer the wrong question — 'what is this?' — when the question users need answered is 'why do I need this today?' The formula below answers the right question, segment by segment.
The first five seconds autoplay in Search — that's your hook, not your home screen
App Store Search results autoplay preview videos inline — silently — without any tap required. The thumbnail-sized clip that plays is the first three to five seconds of your video. This means your hook is not what users see when they tap into your listing: it is what they see while still scrolling the Search results page. If those five seconds show a loading animation, an empty home screen, or a fade from black, you have wasted the one moment where your video competes directly with every other result on the page.
The hook should show your app's single best output — the finished state, not the starting point. A habit tracker showing a 60-day streak. A photo editor showing a before/after that stops a scroll. A writing app showing a polished, completed document. Whatever your app produces at its best: that's frame one. This is the same principle as the first-screenshot rule — lead with the result, not the tool. The hook is a result that earns the next 25 seconds.
One technique that works: record the complete final state of your app and open your video with a three-second hold on that single frame, with a caption naming the outcome. This is the inverse of most app walkthroughs, which build from launch to completion. Opening with the destination, then showing how fast it arrives, changes what skeptical viewers are watching for. They already know the payoff is real; now they want to see whether getting there is actually easy.
Two features beat four — why the feature payload requires restraint
Seconds 5 through 18 are your feature payload: roughly 13 seconds of the highest attention in the video, because the viewer chose to keep watching after the hook. The instinct is to use it for maximum coverage: show every feature, hit every selling point, prove the install is justified. The outcome of that instinct is a carousel nobody processes. Even at four seconds per feature (genuinely fast), 13 seconds covers three features at most, and none of them gets room for context. Three features the viewer skims are worth less than two features the viewer understands.
Pick the two features unique to your app — not the ones every competitor also has. A task manager that shows 'add a task' is demonstrating existence, not differentiation. Show the feature that would make a user delete the competing app they already have. For each feature, structure the clip as: UI state before → action → result. That three-beat pattern registers in four to six seconds. If a feature needs longer than six seconds to demo clearly, the demo needs to be tighter, not the slot longer.
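The arithmetic behind the two-feature limit is simple enough to sketch. The helper below is hypothetical (the name `features_that_fit` is invented for illustration). It divides the 13-second payload by the time one before, action, result clip needs.

```python
# Feature payload window from the structure above: seconds 5 through 18.
PAYLOAD_SECONDS = 18 - 5


def features_that_fit(window_seconds: int, seconds_per_feature: int) -> int:
    """How many complete feature clips fit in the window."""
    if seconds_per_feature <= 0:
        raise ValueError("seconds_per_feature must be positive")
    return window_seconds // seconds_per_feature


# The three-beat pattern registers in four to six seconds per feature.
for seconds in (4, 5, 6):
    count = features_that_fit(PAYLOAD_SECONDS, seconds)
    print(f"{seconds}s per feature -> {count} feature(s)")
```

Only the fastest pacing, four seconds per clip, squeezes in a third feature. At five or six seconds per clip, the window holds two.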
One pattern that works for productivity and utility apps: show features operating on realistic data. A to-do app with three items labeled 'Task 1', 'Task 2', 'Task 3' communicates nothing about how the app feels to live with. The same app with 'Q2 product launch', 'review PR #44', 'call Lisa at 3pm' feels lived-in and trustworthy. Invest time in demo content; it shapes the emotional read of every feature you show. Use the template library as a reference for framing that works both in static screenshots and video storyboards.
Social proof belongs at second 18, not buried in the description
By second 18, the viewer has processed your hook and your features. They are in the evaluation phase — the mental moment where 'this looks interesting' either becomes 'installing now' or 'scrolling past.' This is the highest-leverage moment for social proof, and almost no indie app previews use it. A rating count, a press badge, or a '4.9 ★ from 12,000 reviews' frame does more conversion work here than the same information buried in your App Store description, where 97.5% of visitors never scroll.
If your app is new and has no ratings yet, use this segment as a deeper-feature moment — something screenshots cannot show. An animation, a complex interaction, a workflow spanning three screens. Screenshots freeze one moment. This six-second window lets you show a flow that takes three seconds to unfold — the kind of interaction that makes users think 'I didn't know an app could do that.' That reaction, at second 18, is what converts skeptics. See the screenshot editor for building the static before/after frames that bracket these dynamic moments.
The landing frame is the ad you forgot to make
Seconds 24 through 30 are the most wasted segment in most app previews. The video ends on the home screen, fades to the app icon, or just loops. These are structural defaults, not decisions. The landing frame is the last thing a viewer sees before making their install decision — it should be designed like a standalone ad: your app name, a single value statement, and your best credibility signal. Anything else wastes the closing argument on a formality.
The value statement on the landing frame should differ from your App Store title: the title carries keyword constraints, but the landing frame is purely persuasive. The pattern that converts: verb + benefit + qualifier, where the qualifier names the time or effort saved. 'Build habits in 60 seconds.' 'Meditate without the complexity.' 'Invoice faster than you can type.' A single verb-led statement in large type, held for four seconds, gives the viewer a durable memory of what your app is for before they exit. That memory drives the install tap minutes or hours later.
Sound is muted by default — subtitle your own video
Apple plays all App Store preview videos with audio muted by default. The viewer must actively tap to unmute. Any narration, instruction, or audio cue in your video is inaudible unless text on screen carries the same information. Most apps either record with narration they assume users will hear — they won't — or record with no text at all, leaving the video without a second content layer. The correct approach: treat every second as a silent film and ensure every meaningful moment has a caption.
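One way to audit the silent-film rule is to list your caption cues and look for stretches with nothing on screen. The sketch below assumes captions are available as `(start, end)` pairs in seconds. The function name `uncaptioned_gaps` and the two-second `max_gap` threshold are illustrative assumptions, not an Apple requirement.

```python
def uncaptioned_gaps(captions, video_seconds=30.0, max_gap=2.0):
    """Return (start, end) stretches longer than max_gap with no caption.

    captions: iterable of (start, end) pairs, in seconds.
    max_gap: hypothetical tolerance for caption-free time; tune to taste.
    """
    gaps = []
    cursor = 0.0  # end of the latest caption seen so far
    for start, end in sorted(captions):
        if start - cursor > max_gap:
            gaps.append((cursor, start))
        cursor = max(cursor, end)
    # Check the tail of the video after the last caption.
    if video_seconds - cursor > max_gap:
        gaps.append((cursor, video_seconds))
    return gaps


# Example cue sheet that forgets to caption the credibility segment.
cues = [(0, 5), (5, 12), (12, 18), (24, 30)]
print(uncaptioned_gaps(cues))
```

With this cue sheet the function reports the stretch from second 18 to second 24, which is the credibility segment playing with no text for muted viewers.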
Caption text in a preview video serves double duty: it communicates to muted viewers and — like screenshot overlay text — is potentially indexed by Apple's OCR for App Store search. The keyword discipline that applies to screenshot headlines applies here. Every caption should name the feature or benefit in the language a user would type into App Store Search. 'Daily habit tracker' beats 'Stay consistent.' 'Invoice scanner' beats 'Works automatically.' Precision converts; poetry doesn't rank. Check the screenshot formula guide for caption writing principles that apply equally to video frames.
Poster frame selection drives click rate before the video plays
You can choose any frame from your video as the poster frame — the static image shown before autoplay begins and in contexts where autoplay isn't available. Apple defaults to the first frame, which is frequently a loading state, a partial transition, or a design placeholder. Every developer should select a poster frame manually. The correct choice is your best high-contrast result frame — equivalent to what you would put in screenshot slot 1.
The poster frame matters in two specific contexts: device types where autoplay is disabled (some older iPads, some accessibility settings), and when your listing is viewed on apps.apple.com where video autoplay varies by browser. In both cases, the poster frame is your only visual from the video. If it shows a modal dialog, a partial transition, or test data, you have undermined your preview with a still that would fail screenshot review. Set it deliberately in App Store Connect's media tab.
The silent rejection specs — what kills uploads without a clear error
App Store Connect silently rejects preview uploads that don't meet technical requirements — 'silently' meaning the upload appears to complete but the video fails to process or surfaces a vague error hours later. The requirements causing the most silent rejections: variable frame rate recordings (App Store Connect requires constant 30fps — iOS screen recordings default to variable frame rate and must be converted before upload), resolution mismatches (video dimensions must exactly match the required dimensions for your target device slot), and durations outside 15–30 seconds (31 seconds is rejected; there is no tolerance).
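A preflight check on your side can catch these three problems before the upload. The sketch below assumes you have already read the video's properties with a media inspection tool of your choice; `PreviewMetadata` and `preflight` are hypothetical names. The default required size of 886×1920 is the iPhone 6.9" portrait dimension cited in this guide, so pass a different size for other device slots.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PreviewMetadata:
    width: int
    height: int
    duration_seconds: float
    frame_rate: float
    constant_frame_rate: bool  # assumed to come from your inspection tool


def preflight(meta, required_size=(886, 1920)):
    """Return a list of likely rejection causes; empty means none found."""
    problems = []
    if not meta.constant_frame_rate:
        problems.append("variable frame rate: convert to constant 30 fps")
    elif abs(meta.frame_rate - 30.0) > 0.01:
        problems.append(f"frame rate is {meta.frame_rate:g} fps, expected 30")
    if (meta.width, meta.height) != required_size:
        problems.append(
            f"dimensions {meta.width}x{meta.height} do not match "
            f"{required_size[0]}x{required_size[1]}"
        )
    if not 15.0 <= meta.duration_seconds <= 30.0:
        problems.append(f"duration {meta.duration_seconds:g}s is outside 15-30s")
    return problems


# Example: a raw screen recording that is too long, wrong size, and VFR.
raw = PreviewMetadata(1080, 1920, 31.0, 59.94, constant_frame_rate=False)
for problem in preflight(raw):
    print(problem)
```

The check only covers the three causes named above. It does not validate codec, profile, or audio settings.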
Apple's full technical requirements are documented at developer.apple.com/app-store/app-previews. The one-upload-covers-all-sizes rule saves time: a video for the largest device in each display family — currently iPhone 6.9" at 886×1920 portrait — auto-fits all smaller iPhones in that family. You only need separate uploads for fundamentally different form factors: iPhone and iPad each require their own video. On the encoding side: H.264 High Profile Level 4.0, AAC stereo at 48kHz, YUV 4:2:0 color space. ProRes 422 HQ is also accepted and preserves quality before Apple recompresses on delivery.
For the variable frame rate problem specifically: iOS screen recording via Control Center outputs variable frame rate by default, which App Store Connect rejects. Convert to constant 30fps using HandBrake (free, open source) before uploading. Alternatively, record in Xcode Simulator, which outputs constant frame rate natively. For apps with device-specific UI — Dynamic Island animations, StandBy mode — always record on real hardware, then convert before upload. The screenshot sizes reference covers every required video and image dimension for iOS and Android in one place.
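If you prefer a scriptable route over HandBrake's interface, the same conversion can be done with ffmpeg. This is an alternative to the tool named above, not something Apple prescribes. The sketch assumes ffmpeg is installed with the libx264 encoder, and the file names are placeholders. It only builds and prints the command, so you can inspect it before running anything.

```python
import shlex


def build_cfr_command(src: str, dst: str, fps: int = 30) -> list:
    """Build an ffmpeg command that re-encodes src at a constant frame rate.

    The settings mirror the encoding notes in this guide: H.264 High Profile
    Level 4.0, YUV 4:2:0, AAC stereo at 48 kHz.
    """
    return [
        "ffmpeg",
        "-i", src,
        "-vf", f"fps={fps}",      # fps filter resamples to a constant rate
        "-c:v", "libx264",
        "-profile:v", "high",
        "-level", "4.0",
        "-pix_fmt", "yuv420p",    # YUV 4:2:0
        "-c:a", "aac",
        "-ar", "48000",           # 48 kHz audio
        "-ac", "2",               # stereo
        dst,
    ]


# Placeholder file names; substitute your own recording and output path.
command = build_cfr_command("screen-recording.mov", "preview-cfr.mp4")
print(shlex.join(command))
```

To run the conversion from Python, pass the list to `subprocess.run(command, check=True)`. After converting, confirm that the output dimensions still match the required size for your device slot, since this command does not scale the video.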
Apply the structure to one preview before your next update
The five-frame structure is observable in many of the top-converting previews on the App Store right now. Open a few top-grossing apps in your category and map each preview to the segments above: hook, features, credibility, landing. The pattern shows up again and again. What you will rarely find is a screen recording walkthrough from launch to completion.
Build the static frames that anchor the before/after moments, record the transitions between them, caption every meaningful moment, set your poster frame deliberately, and convert your recording to constant 30fps before you upload. That is the full checklist; everything else is design taste.
Frequently asked questions
How long should an App Store preview video be?
Apple requires preview videos to be between 15 and 30 seconds. A 31-second video is rejected; a 14-second video is rejected. There is no tolerance in either direction. The optimal length is 28–30 seconds — close to the maximum — which gives you enough time to execute the five-frame structure without padding. Very simple utility apps where the core value is demonstrable in under 20 seconds can use a shorter clip effectively.
Can my app preview video include narration or music?
Yes — Apple allows audio in preview videos. However, all App Store previews play with sound muted by default. The viewer must tap to unmute. Any narration, instructions, or audio cues are inaudible unless you also display them as text on screen. Treat narration as a bonus layer for users who unmute, not the primary communication channel. Every meaningful moment in your video needs a caption that works without audio.
Do I need a separate preview video for each iPhone size?
No. Uploading a video at the largest device size in each display family automatically covers all smaller sizes in that family. Currently, one video at 886×1920 portrait (iPhone 6.9") covers all iPhones with the Dynamic Island. You only need separate uploads for fundamentally different form factors — iPhone and iPad each require their own video at the appropriate dimensions.
Why is my App Store Connect preview video upload failing?
The most common causes of silent rejection are: (1) Variable frame rate. iOS screen recordings default to variable frame rate, but App Store Connect requires constant 30fps; convert with HandBrake or record in Xcode Simulator. (2) Resolution mismatch. Video dimensions must exactly match the required dimensions for your target device slot. (3) Duration outside the 15–30 second range. (4) Encoding format. Use H.264 High Profile Level 4.0 or ProRes 422 HQ; other H.264 profile variants are rejected.
How many preview videos can I upload per app?
Apple allows up to three app preview videos per language your app supports. Each localization can have its own set of three previews. Use this to test different hooks or angles — treat each of the three slots as a creative variant targeting a different user motivation, then check App Store Connect analytics to see which preview drives the most time-spent-on-page before install.