Monday, February 6
Variational approach to interpolate and correct adhesion bias
in low-baseline stereo correlation
Stereo correlation techniques used to automatically compute DEMs (Digital Elevation Models) by photogrammetry typically use stereo pairs with a relatively high baseline/height ratio, in order to diminish the relative importance of the adhesion phenomenon, which is a distortion of the model that appears near strong discontinuities or borders of the image. This phenomenon is directly related to the correlation process, and the magnitudes of the artifacts cannot be neglected when trying to obtain sub-pixel accuracies. Correctly modelling and correcting this bias will allow the use of stereo pairs with much lower b/h ratios, which has the great advantage of avoiding many problems due to the occluded parts in the image.
The work by Delon and Rougé characterizes this phenomenon, giving a link between measured and true disparities, and allowing detection of uncorrelatable regions (regions providing no useful information for correlation). Since this leads to a very ill-posed system of equations, many simplifying assumptions have been adopted in order to solve it easily, leading to the so-called barycentric correction of the adhesion phenomenon. Even though the result is highly improved with respect to the raw correlation disparities, one still observes a slightly blurred disparity map, which is especially problematic in urban areas. In this work we propose more precise and natural assumptions to solve this system, namely to regularize the solution by a modified minimal surface term. Such an approach can be expected to preserve sharper edges while still filling in empty areas (those without meaningful correlation information) in a reasonable manner.
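The proposed regularization can be sketched, in our own illustrative notation (not the authors' exact formulation), as minimizing an energy that combines fidelity to the measured disparities with a minimal-surface term:

```latex
E(u) \;=\; \int_\Omega w(x)\,\bigl(u(x)-u_0(x)\bigr)^2\,dx
\;+\; \lambda \int_\Omega \sqrt{1+\beta\,|\nabla u(x)|^2}\,dx
```

Here u is the corrected disparity map, u_0 the raw correlation disparity, and w(x) a confidence weight that vanishes on uncorrelatable regions, so that in those areas the surface term alone fills in the solution; the parameters lambda and beta tune the amount of regularization.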
Our future research will explore the extension of these techniques to motion estimation in image sequences, and its application (in conjunction with our irregular sampling work) to multi-frame superresolution.
Evgeniy Bart (University of Minnesota Twin Cities) http://www.ima.umn.edu/~bart/
Cross-generalization: learning novel classes from a single example by feature replacement
We develop an object classification method that can learn a novel class from a single training example. In this method, experience with already learned classes is used to facilitate the learning of novel classes. Our classification scheme employs features that discriminate between class and non-class images. For a novel class, new features are derived by selecting features that proved useful for already learned classification tasks, and adapting these features to the new classification task. This adaptation is performed by replacing the features from already learned classes with similar features taken from the novel class. A single example of a novel class is sufficient to perform feature adaptation and achieve useful classification performance. Experiments demonstrate that the proposed algorithm can learn a novel class from a single training example, using 10 additional familiar classes. The performance is significantly improved compared to using no feature adaptation. The robustness of the proposed feature adaptation concept is demonstrated by similar performance gains across 107 widely varying object categories.
Digital restoration of antique documents
Antique documents such as photographic prints and books can be affected by several kinds of artefacts: foxing/yellowing, water blotches, fragmented glass plates, screening, etc. Each specific problem can be attacked with advanced algorithms able to recover the original appearance. In this work a brief review of our solutions for virtual restoration is reported, together with some visual examples that demonstrate the effectiveness of the proposed approaches.
Novel image processing algorithms for European
Union Digital Cinema projects
Starting in 2002, Universitat Pompeu Fabra (Barcelona, Spain) has been a
partner in several Digital Cinema projects for the European Union,
involving major European companies. Within this framework, our Image
Processing Group has developed several novel algorithms for digital
cinema postproduction and exhibition. These works include:
-A Day for Night algorithm that accurately models human visual perception regarding color and contrast modification, as well as loss of acuity, through a novel anisotropic diffusion Partial Differential Equation.
-A Depth of Field algorithm that performs real-time, accurate depth of field simulation by running an anisotropic diffusion equation on a programmable graphics card.
-A robust tracking algorithm that improves the Geodesic Active Regions formulation.
-A fast and robust segmentation algorithm based on the Tree of Shapes formulation.
-An Interlaced to Progressive Conversion algorithm that achieves real-time, state of the art results on a regular PC by implementing a variational energy minimization approach on a graphics card.
A suitable detection and tracking approach is proposed for line scratch removal in a digital film restoration process. Unlike impulsive distortions such as dirt spots, which appear randomly in an image, line scratch artifacts persist across several frames. Hence, motion-compensated methods will fail for persistent line scratches. Single-frame methods will also fail if scratches are unsteady or fragmented. The proposed method uses as input a composite image built up from projections of each image of the original sequence. First, a simple 1D-extrema detector provides line scratch candidates for both bright and dark scratches. Next, a MHT (Multiple Hypothesis Tracker) stage uses these candidates to create and maintain multiple hypotheses. As the tracking proceeds through the sequence, each hypothesis gains or loses evidence. To avoid a combinatorial explosion, the hypothesis tree is sequentially pruned. Since hypotheses are set up at each iteration, even if no information is available, a tracked path can cross gaps (missed detections or speckled scratches). The tracking stage then feeds the correction process with valid scratch trajectories.
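The projection-plus-extrema stage can be illustrated with a minimal sketch. The function name, smoothing window and threshold below are our own assumptions, not the authors' implementation: a temporal projection collapses the sequence into a composite image, and narrow 1D extrema of its column profile become scratch candidates.

```python
import numpy as np

def scratch_candidates(frames, threshold=30.0):
    """Find column positions that are 1D extrema of the mean horizontal
    profile of a stack of frames (T x H x W array).

    Bright and dark candidates are returned together. Illustrative
    stand-in for the projection-based detector described above."""
    composite = np.mean(frames, axis=0)        # temporal projection
    profile = composite.mean(axis=0)           # per-column mean -> 1D signal
    # estimate the slowly varying background with a wide moving average
    k = 9
    pad = np.pad(profile, k // 2, mode="edge")
    background = np.convolve(pad, np.ones(k) / k, mode="valid")
    residual = profile - background
    candidates = []
    for x in range(1, len(profile) - 1):
        is_max = residual[x] > residual[x - 1] and residual[x] > residual[x + 1]
        is_min = residual[x] < residual[x - 1] and residual[x] < residual[x + 1]
        if (is_max or is_min) and abs(residual[x]) > threshold:
            candidates.append(x)
    return candidates
```

In the full method these candidates would then seed the MHT stage rather than be accepted directly.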
Denoising archival films using a field-of-experts model of
film grain and natural image statistics
Bayesian denoising of archival film requires a likelihood model that captures the image noise and a spatial prior that captures the statistics of natural scenes. For the former we learn a statistical model of film noise that varies as a function of image brightness. For the latter we use the recently proposed Field-of-Experts framework to learn a generic image prior that captures the statistics of natural scenes. The approach extends traditional Markov Random Field (MRF) models by learning potential functions over extended pixel neighborhoods. Field potentials are modeled using a Products-of-Experts framework that exploits non-linear functions of many linear filter responses. In contrast to previous MRF approaches, all parameters, including the linear filters themselves, are learned from training data. The prior model alone can be used to inpaint missing image structures, and the data noise model can be used to simulate realistic film grain. Additionally we demonstrate how the learned likelihood and prior models can be used to denoise archival film footage.
Joint work with Stefan Roth and Teodor Moldovan.
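In Roth and Black's formulation of the Field-of-Experts prior (notation ours), the probability of an image x is a product of Student-t experts over all overlapping patches:

```latex
p(\mathbf{x}) \;\propto\; \prod_{k}\prod_{i=1}^{N}
\phi\bigl(\mathbf{J}_i^{\top}\mathbf{x}_{(k)};\,\alpha_i\bigr),
\qquad
\phi(y;\alpha) \;=\; \Bigl(1+\tfrac{1}{2}\,y^{2}\Bigr)^{-\alpha}
```

where x_(k) denotes the k-th image patch, the linear filters J_i and weights alpha_i are learned from data, and denoising amounts to approximate MAP estimation under this prior combined with the brightness-dependent film-noise likelihood.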
Movie restoration methods
The state-of-the-art movie restoration methods compensate the motion by an optical flow estimate and then filter the compensated movie. Now, the motion estimation problem is fundamentally ill-posed. This fact is known as the aperture problem: trajectories are ambiguous, since they could coincide with any promenade in the space-time isophote surface. In this talk, we show that the aperture problem can be taken advantage of. This observation leads us to apply the recently introduced NL-means algorithm to movies. This static 3D algorithm involves the whole movie isophote and not just a trajectory.
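The idea of averaging over the whole spatio-temporal volume, rather than along an estimated trajectory, can be sketched as follows. This is a naive O(n^2) version for illustration only (real implementations restrict the search window); the function name and parameters are ours.

```python
import numpy as np

def nl_means_video(video, patch=1, h=10.0):
    """Naive spatio-temporal NL-means. `video` is a T x H x W float array,
    `patch` the half-size of the square spatial patch, and `h` controls how
    fast weights decay with patch dissimilarity. Every pixel is replaced by
    a weighted average of all pixels in the volume whose surrounding
    patches look similar -- no motion estimation is performed."""
    T, H, W = video.shape
    p = patch
    padded = np.pad(video, ((0, 0), (p, p), (p, p)), mode="edge")
    coords = [(t, y, x) for t in range(T) for y in range(H) for x in range(W)]
    patches = np.array([padded[t, y:y + 2 * p + 1, x:x + 2 * p + 1].ravel()
                        for (t, y, x) in coords])
    centers = video.reshape(-1)
    out = np.empty_like(centers)
    for i, q in enumerate(patches):
        d2 = ((patches - q) ** 2).mean(axis=1)   # patch dissimilarities
        w = np.exp(-d2 / (h * h))                # similarity weights
        out[i] = (w * centers).sum() / w.sum()
    return out.reshape(video.shape)
```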
Computer vision success stories in visual media production: Camera
tracking and motion capture
The twin technologies of camera tracking and motion capture are key components in the modern movie production pipeline, without which such effects-laden productions as "Revenge of the Sith" and "The Lord of the Rings" simply would not be possible. Recent advances have been driven by the successful application of algorithms developed in the computer vision research community to these real-world problems. The resulting highly automated, robust software solutions have greatly reduced the time and level of specialist skill required of the operator, hence reducing the overall costs of camera tracking and motion capture. Consequently, these technologies are now commonly being used in much lower budget productions such as television advertising, music promos and video games. I will talk about the underlying problems involved in camera tracking and motion capture, and will illustrate the modern approaches to them using 2d3's "boujou" camera tracker and ViconPeak's "IQ" motion capture software.
The grammar of storytelling on film survives technology
An art form barely 100 years old has seen the evolution of film grammar as new technologies emerge. Just as paintings changed when oil paint was overtaken by acrylic paint, motion pictures have seen three systemic technological changes: silent film to sound, black and white film to color, photochemical to digital. Each of these changes affected how filmmakers told stories. This talk - with many film clip samples - will attempt to give an overview of post-production technology, with special emphasis on how the movement from photochemical to digital has affected the film editing process. In addition, the speaker, a film editor currently working in Hollywood, will describe his fifteen-year adventure developing digital tools for film restoration.
Julie Delon (France Telecom R&D, LLC) http://www.tsi.enst.fr/~delon/
Movie and Video Scale-Time Equalization
Image flicker is a common film artifact, which can be observed in low-sampled videos as well as in old films, and consists of fast variations of the frame contrast and brightness. Reducing the flicker of a sequence improves its visual quality and can be an essential first treatment before subsequent processing. An axiomatic analysis of the problem leads to a global and generic "de-flicker" method, based on scale-space theory. As a global process, this correction is robust to global noise, shaking and small motion. The scale-time framework leads to simple stability results, ensures the robustness of the method to blotches or impulse noise, and guarantees that no bias or deviation can appear in time. In cases of flicker mixing both very local changes and global oscillations, this process can still be used as a first deflickering step before a more local treatment.
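A much cruder global deflicker conveys the flavour of the approach: map each frame affinely so that its mean and contrast follow temporally smoothed statistics. This sketch is ours and omits the scale-space machinery of the actual method.

```python
import numpy as np

def deflicker(frames, radius=2):
    """Reduce flicker by mapping each frame affinely so that its mean and
    standard deviation follow a temporal moving average of those statistics.
    A crude global stand-in for the scale-time equalization described
    above; `radius` is the half-width of the temporal averaging window."""
    means = np.array([f.mean() for f in frames])
    stds = np.array([f.std() for f in frames])
    out = []
    for t, f in enumerate(frames):
        lo, hi = max(0, t - radius), min(len(frames), t + radius + 1)
        m_target, s_target = means[lo:hi].mean(), stds[lo:hi].mean()
        gain = s_target / stds[t] if stds[t] > 0 else 1.0
        out.append((f - means[t]) * gain + m_target)
    return out
```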
3D models from image sequences
I shall talk about building 3D models from image sequences, and in particular about rendering new views of existing sequences in order to create stereoscopic 3D from monocular footage. I shall show how existing strategies for image-based rendering can be augmented using image-based priors to create realistic 3D views. In addition I will talk about the difficult problem of creating 3D when there is no camera motion.
Getting rid of scratches and blotches
Currently lots of old celluloid movies are digitized to save them from decay. Most of the footage has already suffered from aging or abrasion and should be processed to improve its quality. In our work we focus on removing scratches and blotches resulting from mechanical damage to the celluloid layer. Bearing in mind that even movies of moderate length consist of several thousand frames, we face two main problems. First, manually highlighting the corrupted pixels is not feasible; they have to be detected automatically. Second, processing time should be kept low. We employ a method based on optical flow for detection and removal of scratches. Where this method fails, we apply a hybrid still-image inpainting technique, utilizing PDE inpainting and texture synthesis methods. Due to the use of efficient numerical algorithms, an optimized implementation can achieve a processing time of a few tens of seconds per frame. Furthermore, the algorithm is highly parallelizable, except for a single step.
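The PDE inpainting component can be illustrated by its simplest instance: harmonic inpainting by iterated neighbour averaging. The texture-synthesis stage is not modelled here, and the names and iteration count are our own.

```python
import numpy as np

def diffusion_inpaint(image, mask, iterations=200):
    """Fill pixels where `mask` is True by repeatedly replacing them with
    the average of their 4-neighbours (simple harmonic/PDE inpainting).
    The hybrid texture-synthesis stage described above is not modelled."""
    img = image.astype(float).copy()
    img[mask] = img[~mask].mean()   # rough initialization of the hole
    for _ in range(iterations):
        up = np.roll(img, -1, axis=0); down = np.roll(img, 1, axis=0)
        left = np.roll(img, -1, axis=1); right = np.roll(img, 1, axis=1)
        avg = (up + down + left + right) / 4.0
        img[mask] = avg[mask]       # only masked pixels are updated
    return img
```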
Restoration and zoom of irregularly sampled, blurred and noisy images
by accurate total variation minimization with local constraints
We propose an algorithm to solve a problem in image restoration which considers several different aspects of it, namely: irregular sampling, denoising, deconvolution, and zooming. Our algorithm is based on an extension of a previous image denoising algorithm proposed by A. Chambolle using total variation, combined with irregular to regular sampling algorithms proposed by H.G. Feichtinger, K. Gröchenig, M. Rauth and T. Strohmer. Finally we present some experimental results and we compare them with those obtained with the algorithm proposed by K. Gröchenig et al.
Joint work with A. Almansa, V. Caselles and B. Rougé.
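One illustrative way to write the restoration problem (our notation, not necessarily the paper's exact constraints) is total variation minimization subject to local fidelity constraints:

```latex
\min_{u}\; \int_\Omega |\nabla u|\,dx
\quad\text{subject to}\quad
\int_\Omega g_i(x)\,\bigl((h*u)(x)-\tilde f(x)\bigr)^2\,dx \;\le\; \sigma^2
\quad\text{for each window } i
```

where h is the blur kernel, \tilde f the image reconstructed from the irregular samples, the g_i are localized windows that make the noise constraint hold locally rather than only on average, and sigma is the noise standard deviation.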
Challenging shots: Demo your problem and discuss it with us
This is a moderated session. Participants who can share media in advance are encouraged to do so, so that others can try things ahead of time. The purpose is to structure quick demos of very specific problems, providing a context to bridge discussions between commercial tools and research. In short: present, in 5 minutes or less, a real, specific problem to a room of experts.
Additional information and samples available at http://www.revisioneffects.com/IMA/ChallengingImaDemos.htm.
The European project PrestoSpace started in February 2004. The project aims to provide a complete solution for preserving the audiovisual material found in archives (e.g. BBC, RAI or INA). We present a general overview of this project and focus on the restoration task by presenting each partner's research topics. Finally, we will describe in more depth the activities of Joanneum Research within this project.
Anil Kokaram (Trinity College) http://www.mee.tcd.ie/~ack/
Pushing pixels: From restoration to postproduction
Joint with Bill Collis (The Foundry).
The need for the design of new image manipulation tools for both consumer and professional postproduction has substantially widened the breadth of research in image/video/vision processing. While machine vision tools have been used successfully by industry for many years, it is only with the success of Digital Television and Digital Media Streaming that more sophisticated moving image processing has shown mainstream success in the worldwide community. This talk tries to chart a course showing how tools in restoration that have been considered for over a decade now have migrated into the post-production community, where they have metamorphosed into other applications. It highlights some emerging trends and tries to explore why researchers and post-production industrialists have become friends.
Blotch Detection for Digital Archives Restoration
based on the Fusion of Spatial and Temporal Detectors
Joint work with Sorin Tilie (INA) and Isabelle Bloch (ENST).
This paper proposes a method based on the Dempster-Shafer evidence theory for the detection of blotches in digitized archive film sequences. The detection scheme relies on the fusion of two uncorrelated, fast but inaccurate, spatio-temporal blotch detectors. The imprecision and uncertainty of both detectors are modeled using Dempster-Shafer evidence theory, which improves the decision by taking into account the ignorance and the conflict between detectors.
We found that this combination scheme improves the performance of single blotch detectors, and compares favorably to more complex and time consuming blotch detection methods, for real archive film sequences.
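The core of the fusion step, Dempster's rule of combination over the simple frame {blotch, clean}, can be sketched as follows. The actual paper's mass assignments are richer; the names here are illustrative.

```python
def dempster_combine(m1, m2):
    """Combine two mass functions over the frame {'blotch', 'clean'} with
    Dempster's rule. Mass may also be assigned to the whole frame
    ('either') to express a detector's ignorance."""
    hypotheses = ["blotch", "clean", "either"]

    def meet(a, b):
        # intersection of two hypotheses; None means the empty set (conflict)
        if a == "either":
            return b
        if b == "either":
            return a
        return a if a == b else None

    combined = {h: 0.0 for h in hypotheses}
    conflict = 0.0
    for a in hypotheses:
        for b in hypotheses:
            mass = m1.get(a, 0.0) * m2.get(b, 0.0)
            inter = meet(a, b)
            if inter is None:
                conflict += mass
            else:
                combined[inter] += mass
    if conflict >= 1.0:
        raise ValueError("total conflict: sources cannot be combined")
    # renormalize by the non-conflicting mass
    return {h: v / (1.0 - conflict) for h, v in combined.items()}
```

Two confident-but-ignorant detectors that agree reinforce each other: combining masses of 0.6 and 0.5 on "blotch" (remainder on "either") yields 0.8 on "blotch".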
A variational approach to blending based on warping for non-overlapped images
We present a new model for image blending based on warping. The model is represented by partial differential equations (PDEs) and produces a sequence of images that has the properties of both blending of image intensities and warping of image shapes. We modified the energy functional in the work by Liao et al. (2002) in order to adapt the idea of shape warping to image blending. The PDEs derived from the proposed energy functional cover not only overlapped images but also non-overlapped ones.
New approaches to line scratch restoration
We consider the problem of detecting and removing line scratches from digital image sequences. In particular, we present an approach based on data fusion techniques for combining relatively well settled distinct techniques. Moreover, focusing on blue scratches, we describe a detection method and a removal method that strongly rely on the specific features of such scratches. Evaluation of the proposed methods and numerical experiments on real images are reported.
Techniques and rationale for motion picture restoration
Motion picture restoration spans the gamut from archival preservation of historically and culturally significant works to pragmatic treatment of low budget titles to extensive polishing of today's blockbusters. Each restoration project has its own idiosyncrasies, including original storage technology, type of damage, and final delivery requirements. Each project needs to strike its own balance between speed and accuracy of processing. Restoration must be approached in a way that addresses the peculiarities and considerations unique to the material. This talk presents some of the prototypical challenges encountered in motion picture restoration. The evolution and consequences of several prevalent storage technologies are discussed. Various sources of image degradation and damage---both common and unusual---are demonstrated. State-of-the-art restoration techniques are presented and critiqued. The requirements and properties of modern delivery mechanisms are explained.
The evolution of film editing technique and its implications to the
parsing and summarization of motion pictures
Techniques employed in film editing have evolved rapidly with increasingly sophisticated and complex methods being used to enhance storytelling. This talk will examine the relationship between scene and shot, picture and sound with a discussion of how an understanding of editing technique can be leveraged to enhance the automated analysis of film content.
About the Speaker: John Mateer joined the University of York in 2001 specifically to design and develop the media production and analysis components of an innovative new teaching and research initiative in Media Technology. His expertise lies in the integration and application of new media technologies in different traditional media production contexts. Prior to this appointment, he worked for over 15 years as a producer, director and production consultant. He is a graduate of NYU's Tisch School of the Arts and AFI's Center for Advanced Film and Television Studies, and is an active member of the Directors Guild of Great Britain.
Tracking planes in images: Applications in post-production
Imagineer Systems Ltd was founded in 2000 with the aim of building innovative products based around computer vision technology. Our first product, mokey, has helped to automate various important tasks in film and video post-production, including wire and rig removal, stabilisation, lens distortion correction and matte creation. Two new products, monet and motor, are specialised for compositing and rotoscoping applications. Our core technology is a fast and accurate tracker for affine and projective 2D motion.
In my seminar I shall relate some of the history of the company and summarise the algorithms and software we have developed, in particular our "Gandalf" computer vision library (see gandalf-library.sf.net). There will be extensive demonstrations of mokey and monet. If time permits, I will present a mathematical conundrum in the area of the normalisation of projective quantities.
Video content analysis
In this work a software framework for processing video data, integrating existing open source libraries, and a set of applications for video content analysis are presented. These are partial results of an ongoing project. Due to the huge amount of data contained in video sequences, a set of constraints was considered while designing the system to keep the computational cost and memory requirements as low as possible, which led to a simple and effective system architecture. The system also includes an MPEG7-like description module that allows extracting and storing a video content description. Based on this system, a set of applications for object extraction, shot detection, and content analysis was developed. These applications were used to test the developed software and to develop new solutions. Novel results for object extraction, shot detection and content description are presented.
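A minimal baseline for one of these applications, shot cut detection by histogram differencing, might look like this. The threshold and names are our own illustrative choices, not the system's actual detector.

```python
import numpy as np

def detect_shot_cuts(frames, bins=16, threshold=0.5):
    """Flag a cut between consecutive frames when the L1 distance between
    their normalized grey-level histograms exceeds `threshold`. Returns
    the indices of frames that start a new shot."""
    hists = []
    for f in frames:
        h, _ = np.histogram(f, bins=bins, range=(0, 256))
        hists.append(h / max(h.sum(), 1))
    cuts = []
    for t in range(1, len(hists)):
        if np.abs(hists[t] - hists[t - 1]).sum() > threshold:
            cuts.append(t)
    return cuts
```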
Alessandro Rizzi (Università di Milano) http://www.dti.unimi.it/~rizzi/
A human color perception model for film unsupervised digital restoration
Film-based media become unstable over time unless they are stored at low temperatures and the humidity is controlled. Some defects, such as bleaching, are difficult to solve using photochemical restoration methods; in such cases, digital restoration can be an alternative solution. The basic idea of the proposed work is to mimic the robust capabilities of the human vision system (HVS) to set up a tool to filter damaged frames in a partially automated way. In fact, film colour cast, caused by ageing, can be considered as generic chromatic noise; thus a colour constancy method can be suitable for restoring it. Moreover, a colour constancy method inspired by HVS behaviour does not need any a priori information about the colour cast and its magnitude. Another advantage of HVS-inspired algorithms is their local effect, since film chemical deterioration is usually non-uniform. Several tests have been performed with an algorithm called ACE (Automatic Colour Equalization). The technique presented here is not just an application of ACE to movie images, but also an enhancement of ACE principles to meet the requirements of digital film restoration practice. The basic ACE computation autonomously extracts the visual content of the frame, correcting the colour cast if present and expanding the dynamic range. This behaviour is not always a good restoring solution: there are cases in which the cast has to be maintained (e.g. underwater shots) or the dynamic range should not be expanded (e.g. sunset or night shots). To this aim, new functions have been added to preserve the natural histogram shape, adding new efficacy to the restoration process. Examples are presented to discuss the characteristics, advantages and limits of the use of perceptual models in digital movie colour restoration.
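For contrast with the local, perceptual ACE approach, the simplest global colour-constancy baseline is grey-world correction, sketched below. This is our own illustrative code, not ACE itself; it handles only a uniform cast, which is precisely the limitation that motivates the local method.

```python
import numpy as np

def grey_world_correction(image):
    """Remove a global colour cast by scaling each channel so that its mean
    equals the overall mean intensity (the grey-world assumption).
    `image` is an H x W x C array with values in [0, 255]."""
    img = image.astype(float)
    channel_means = img.reshape(-1, img.shape[-1]).mean(axis=0)
    target = channel_means.mean()
    gains = target / np.maximum(channel_means, 1e-9)
    return np.clip(img * gains, 0, 255)
```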
Daniel Rockmore (Dartmouth College) http://www.cs.dartmouth.edu/~rockmore
Mathematics: Maker and muse for modern art
For all the discussion of right brain/left brain conflict, mathematics and the arts actually have a very healthy relationship, one which can perhaps be traced to their common goal of finding a way to give expression to the grand truths of experience. Mathematician Dan Rockmore will take us on a tour of modern art history and point out some of the surprising ways in which mathematical ideas have been and continue to be an enabler as well as inspiration for some of the big ideas in the visual arts.
Dan Rockmore is a Professor of Mathematics and Computer Science at Dartmouth College, where he has taught since 1991. He recently published Stalking the Riemann Hypothesis : The Quest to Find the Hidden Law of Prime Numbers.
Daniel Rockmore (Dartmouth College) http://www.cs.dartmouth.edu/~rockmore
Math Matters: IMA Public Lecture - Artful mathematics
"But you said it was 99% accurate! Why is there a defect in the first frame?"
Gentle techniques for using mathematics in motion picture processes to marry film and digital
The world of motion picture production and restoration is populated by artists and business people, not mathematicians. Yet, as more and more digital processes, many of which implement non-deterministic algorithms, are introduced into the industry, the probability of error at any stage has increased while the probability of identification and correction has decreased. In addition, many people who came into the industry when silver halide film and photochemistry were the only tools lack the understanding of digital tools that newer entrants to the industry have, and vice versa. I will discuss in detail the mathematical and social solutions developed to solve problems in principal photography and post production related to digital capture defect correction, film image quality control and workflow. I will also detail how a thorough understanding of traditional photochemical film processes allows for the creation of processes with an optimal combination of mathematics, art, analog and digital while, at the same time, educating cinematographers, directors and executives in how to recognize and overcome the complexities of these processes.
Guillermo R. Sapiro (University of Minnesota Twin Cities) http://www.ece.umn.edu/users/guille/
Image and video inpainting
In this talk we will review basic techniques for image inpainting and present new ones for video inpainting under constrained camera motion.
Image-based rendering has been one of the hottest areas in computer graphics in recent years. Instead of using CAD and painting tools to construct graphics models by hand, IBR uses real-world imagery to rapidly create extremely photorealistic shape and appearance models. However, IBR results to date have mostly been restricted to static objects and scenes. Video-based rendering brings the same kind of realism to computer animation, using video instead of still images as the source material. Examples of VBR include facial animation from sample video, repetitive video textures that can be used to animate still scenes and photos, 3D environment walkthroughs built from panoramic video, and 3D video constructed from multiple synchronized cameras. In this talk, I survey a number of such systems developed by our group and by others, and suggest how this kind of approach has the potential to fundamentally transform the production (and consumption) of interactive visual media.
About the Speaker: Richard Szeliski leads the Interactive Visual Media Group at Microsoft Research, which does research in digital and computational photography, video scene analysis, 3-D computer vision, and image-based rendering. He received a Ph.D. degree in Computer Science from Carnegie Mellon University, Pittsburgh, in 1988. He joined Microsoft Research in 1995. Prior to Microsoft, he worked at Bell-Northern Research, Schlumberger Palo Alto Research, the Artificial Intelligence Center of SRI International, and the Cambridge Research Lab of Digital Equipment Corporation. Dr. Szeliski has published over 100 research papers in computer vision, computer graphics, medical imaging, and neural nets, as well as the book Bayesian Modeling of Uncertainty in Low-Level Vision.
He was a Program Committee Chair for ICCV'2001 and the 1999 Vision Algorithms Workshop, served as an Associate Editor of the IEEE Transactions on Pattern Analysis and Machine Intelligence and on the Editorial Board of the International Journal of Computer Vision, and is a Founding Editor of Foundations and Trends in Computer Graphics and Vision.
Computer vision in movie production: problems and expectations
I shall talk about some practical problems of special effects / animation production that could be broadly defined as Inverse Problems: data is recovered from observed images rather than generated from user specification. Most Computer Vision tasks in movie production, as opposed to most Computer Graphics tasks, fall into the Inverse Problems category. The problems I will speak of are: surveyless rigid-body tracking, articulated-body tracking (a body as a hierarchy of rigid objects), face and flexible-surface tracking (marker-based, markerless, featureless), and photomodelling, including what is practically important for our industry in algorithm development, and our expectations. Automatic 3D terrain generation from a monocular camera image sequence will be presented as an example of the discussed problems and solutions.
Todd Wittman (University of Minnesota Twin Cities) http://www.math.umn.edu/~wittman
A Variational Approach to Image and Video Super-Resolution
Super-resolution seeks to produce a high-resolution image from a set of low-resolution, possibly noisy, images such as in a video sequence. We present a method for combining data from multiple images using the Total Variation (TV) and Mumford-Shah functionals. We discuss the problem of sub-pixel image registration and its effect on the final result.
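A common way to write the TV variant of this problem (our notation; the Mumford-Shah functional additionally introduces an explicit edge set) is:

```latex
\min_{u}\; \sum_{k} \bigl\| D\,H\,W_k\,u - f_k \bigr\|_2^2
\;+\; \lambda \int_\Omega |\nabla u|\,dx
```

where the f_k are the observed low-resolution frames, W_k is the warp given by sub-pixel registration of frame k, H models camera blur, D is downsampling, and the total variation term regularizes the high-resolution estimate u while preserving edges. Errors in the registration W_k propagate directly into the data term, which is why registration accuracy dominates the final result.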
Andrew Zisserman (University of Oxford) http://www.eng.ox.ac.uk/World/Info/Staff/zisserman.a.html
Retrieving video of people and places
In film editing, or in exploring large video archives, there is a need to access shots by their visual content directly, as textual annotations may not be available. In the first part of this talk I will describe an approach to searching for and localizing all the occurrences of an object in a video. The object is represented by a set of viewpoint invariant fragments that enable recognition to proceed successfully despite changes in viewpoint, illumination and partial occlusion. The fragments act as "visual words" for describing the scene, and by pushing this analogy efficient methods from text retrieval can be employed to retrieve shots in the manner of a Google search of the web.
In the second part of the talk I'll describe progress in searching for people in videos by matching their faces. Face recognition is a challenging problem because changes in pose, illumination and expression can exceed those due to identity. Fortunately, video enables multiple examples of each person to be associated automatically using straightforward visual tracking. We demonstrate how these multiple examples can be harnessed to reduce the ambiguity of matching.
The methods will be demonstrated on several feature length films.
This is joint work with Josef Sivic and Mark Everingham.
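The "visual words" analogy from the first part of the talk can be sketched with standard text-retrieval machinery: each shot becomes a document of quantized descriptor IDs, and shots are ranked by tf-idf cosine similarity against the query. This is a generic sketch, not the authors' code; the vector quantization of viewpoint-invariant descriptors into word IDs is assumed to have happened upstream.

```python
import math
from collections import Counter

def tfidf_score(query_words, shot_words):
    """Rank shots by cosine similarity of tf-idf 'visual word' vectors,
    exactly as one would rank text documents. `query_words` is a list of
    integer word IDs; `shot_words` is a list of such lists, one per shot."""
    n = len(shot_words)
    df = Counter()                       # document frequency of each word
    for words in shot_words:
        df.update(set(words))

    def vec(words):
        tf = Counter(words)
        total = sum(tf.values())
        # term frequency weighted by inverse document frequency
        return {w: (c / total) * math.log(n / df[w])
                for w, c in tf.items() if df.get(w)}

    q = vec(query_words)
    scores = []
    for words in shot_words:
        d = vec(words)
        dot = sum(q[w] * d[w] for w in q if w in d)
        nq = math.sqrt(sum(v * v for v in q.values()))
        nd = math.sqrt(sum(v * v for v in d.values()))
        scores.append(dot / (nq * nd) if nq and nd else 0.0)
    return scores
```

The inverted-index tricks from text retrieval then make this ranking fast enough to search a full feature film interactively.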