Published on December 6, 2024

Continuous Video Process: Modeling Videos as Continuous Multi-Dimensional Processes for Video Prediction

Categories: cs.CV, cs.AI, cs.LG, stat.ML

Release Date: December 6, 2024

Authors: Gaurav Shrivastava¹, Abhinav Shrivastava¹

Affiliation: ¹University of Maryland, College Park

arXiv: http://arxiv.org/pdf/2412.04929v1

Method (KTH, $10 \rightarrow \#\text{pred}$; trained on $k$)	$k$	$\#\text{pred}$	FVD $\downarrow$	PSNR $\uparrow$	SSIM $\uparrow$
SVG-LP [15]	10	30	377	28.1	0.844
SAVP [28]	10	30	374	26.5	0.756
MCVD [51]	5	30	323	27.5	0.835
SLAMP [1]	10	30	228	29.4	0.865
SRVP [20]	10	30	222	29.7	0.870
RIVER [12]	10	30	180	30.4	0.86
CVP (Ours)	1	30	140.6	29.8	0.872
Struct-vRNN [31]	10	40	395.0	24.29	0.766
SVG-LP [15]	10	40	157.9	23.91	0.800
MCVD [51]	5	40	276.7	26.40	0.812
SAVP-VAE [28]	10	40	145.7	26.00	0.806
Grid-keypoints [21]	10	40	144.2	27.11	0.837
RIVER [12]	10	40	170.5	29.0	0.82
CVP (Ours)	1	40	120.1	29.2	0.841
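
The PSNR column in the table above is a standard per-pixel fidelity score measured in decibels, where higher is better. As a rough illustration of how such a number is typically obtained, here is a minimal sketch in plain Python. The function names `psnr` and `mean_video_psnr`, the flat-list frame representation, and the choice to average per-frame scores over the clip are all assumptions made for this example. The paper's actual evaluation protocol (for instance how multiple samples are aggregated) is not stated here and may differ.

```python
import math
from typing import Sequence


def psnr(reference: Sequence[float], prediction: Sequence[float],
         max_value: float = 1.0) -> float:
    """Peak signal-to-noise ratio (dB) between two frames.

    Frames are given as flat sequences of pixel intensities in
    [0, max_value]. This representation is an assumption of the sketch.
    """
    if len(reference) != len(prediction):
        raise ValueError("frames must have the same number of pixels")
    if len(reference) == 0:
        raise ValueError("frames must not be empty")
    # Mean squared error over all pixels.
    mse = sum((r - p) ** 2 for r, p in zip(reference, prediction)) / len(reference)
    if mse == 0:
        return math.inf  # identical frames
    return 10.0 * math.log10(max_value ** 2 / mse)


def mean_video_psnr(ref_frames: Sequence[Sequence[float]],
                    pred_frames: Sequence[Sequence[float]],
                    max_value: float = 1.0) -> float:
    """Average of per-frame PSNR over a predicted clip (assumed aggregation)."""
    if len(ref_frames) != len(pred_frames) or len(ref_frames) == 0:
        raise ValueError("clips must be non-empty and of equal length")
    scores = [psnr(r, p, max_value) for r, p in zip(ref_frames, pred_frames)]
    return sum(scores) / len(scores)
```

For example, a frame that is off by 0.5 at every pixel (on a 0 to 1 intensity scale) has an MSE of 0.25 and a PSNR of about 6 dB, so the values near 29 to 30 dB in the table correspond to much smaller per-pixel errors. FVD and SSIM need a pretrained network and windowed statistics respectively, so they are not sketched here.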