

Tomokazu Sato, Ph.D.


Faculty of Data Science, Shiga University

e-mail: t-sato@biwako.shiga-u.ac.jp

March, 1977

Born in Tanabe, Wakayama prefecture, Japan

1995.4 - 1999.3

Undergraduate student
at Osaka Prefecture University, Japan
(Received Bachelor of Engineering in 1999)

1999.4 - 2001.3

Master's course student
at Nara Institute of Science and Technology, Japan
(Received Master of Engineering in 2001)

2001.4 - 2003.3

Doctoral course student
at Nara Institute of Science and Technology, Japan
(Received Doctor of Engineering in 2003)

2003.4 - 2011.4

Assistant professor at Nara Institute of Science and Technology, Japan

2010.3 - 2011.3

Visiting researcher at CMP, Czech Technical University in Prague

2011.5 - 2017.12

Associate professor at Nara Institute of Science and Technology, Japan

2018.1 -

Professor at Shiga University

Research interests:

Structure from motion for omni-directional video



Abstract: An omni-directional camera of the multi-camera type has the advantages of high resolution and almost uniform resolution in every viewing direction. In this research, an extrinsic camera parameter recovery method for a moving omni-directional multi-camera system (OMS) is proposed. First, we discuss a perspective n-point (PnP) problem for an OMS, and then describe a practical method for estimating extrinsic camera parameters from multiple image sequences obtained by an OMS. The proposed method is based on the shape-from-motion and PnP techniques.

T. Sato, S. Ikeda, and N. Yokoya: "Extrinsic camera parameter recovery from multiple image sequences captured by an omni-directional multi-camera system", Proc. European Conf. on Computer Vision (ECCV2004), Vol. 2, pp. 326-340, May 2004. (pdf file)

Depth estimation for omni-directional video


Abstract: This paper proposes a method for estimating depth from long-baseline image sequences captured by a precalibrated moving omni-directional multi-camera system (OMS). Our idea for estimating an omni-directional depth map is very simple: counting interest points in images is integrated into the framework of conventional multi-baseline stereo. Even with such a simple algorithm, depth can be determined without computing similarity measures such as SSD and NCC, which have been used for traditional stereo matching. The proposed method achieves depth estimation that is robust against image distortions and occlusions, at lower computational cost than traditional multi-baseline stereo methods. These advantages suit the characteristics of omni-directional cameras.

T. Sato and N. Yokoya: "Efficient hundreds-baseline stereo by counting interest points for moving omni-directional multi-camera system", Journal of Visual Communication and Image Representation, Vol. 21, No. 5-6, pp. 416-426, July 2010. (pdf file)
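
The counting idea in the abstract above can be illustrated with a toy sketch in Python. It is not the paper's implementation: it assumes pinhole cameras that share one focal length and have identity rotation, and the names `count_score`, `estimate_depth`, and `radius` are hypothetical. For each candidate depth, the 3-D point on the reference ray is projected into every other frame, and the score is simply the number of frames that have an interest point near the projection.

```python
import numpy as np

def count_score(pixel, depth, focal, centers, keypoints, radius):
    """Count the frames that have an interest point near the projection."""
    # Back-project the reference pixel to a 3-D point at the candidate depth.
    ray = np.array([pixel[0] / focal, pixel[1] / focal, 1.0])
    point = depth * ray
    score = 0
    for center, kps in zip(centers, keypoints):
        rel = point - center          # assumption: identity rotation
        if rel[2] <= 0:               # point is behind this camera
            continue
        proj = focal * rel[:2] / rel[2]
        if len(kps) and np.min(np.linalg.norm(kps - proj, axis=1)) <= radius:
            score += 1
    return score

def estimate_depth(pixel, depths, focal, centers, keypoints, radius=0.4):
    """Return the candidate depth with the highest count, plus all counts."""
    scores = [count_score(pixel, d, focal, centers, keypoints, radius)
              for d in depths]
    return depths[int(np.argmax(scores))], scores

# Synthetic demo: one scene point at depth 5 seen from five translated cameras.
focal = 100.0
pixel = (10.0, 0.0)
centers = [np.array([0.1 * i, 0.0, 0.0]) for i in range(1, 6)]
# Each frame holds the true projection of the point and one unrelated point.
keypoints = [np.array([[10.0 - 2.0 * i, 0.0], [50.0, 50.0]])
             for i in range(1, 6)]
depths = [2.0, 3.0, 4.0, 5.0, 6.0, 8.0]
best_depth, scores = estimate_depth(pixel, depths, focal, centers, keypoints)
print(best_depth, scores)
```

No similarity measure such as SSD or NCC is evaluated in this sketch; only the presence of a nearby interest point matters.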

3D modeling from video images


Abstract: In this paper, we propose a dense 3-D reconstruction method that first estimates extrinsic camera parameters of a hand-held video camera, and then reconstructs a dense 3-D model of a scene. In the first process, extrinsic camera parameters are estimated by automatically tracking a small number of predefined markers of known 3-D positions together with natural features. Then, several hundred dense depth maps obtained by multi-baseline stereo are combined in a voxel space. We can accurately acquire a dense 3-D model of an outdoor scene by using several hundred input images captured by a hand-held video camera.

T. Sato, M. Kanbara, N. Yokoya, and H. Takemura: "Dense 3-D reconstruction of an outdoor scene by hundreds-baseline stereo using a hand-held video camera", International Journal of Computer Vision, Vol. 47, No. 1-3, pp. 119-129, April 2002. (pdf file)

Interactive 3D modeling with AR support


Abstract: In most conventional methods, users need some skill in controlling the camera movement to obtain a good 3-D model. In this study, we propose an interactive 3-D modeling interface in which special skills are not required. This interface consists of "indication of camera movement" and "preview of reconstruction result." In experiments for subjective evaluation, we verify the usefulness of the proposed 3-D modeling interface.

K. Fudono, T. Sato, and N. Yokoya: "Interactive 3-D modeling system using a hand-held video camera", Proc. 14th Scandinavian Conf. on Image Analysis (SCIA2005), pp. 1248-1258, June 2005. (pdf file)

Extrinsic camera parameter estimation using vision and GPS


Abstract: This paper describes a method for estimating extrinsic camera parameters using both feature points on an image sequence and sparse position data acquired by GPS. Our method is based on a structure-from-motion technique but is enhanced by using GPS data so as to minimize accumulative estimation errors. Moreover, the position data are also used to remove mis-tracked features. The proposed method allows us to estimate extrinsic parameters without accumulative errors even from an extremely long image sequence.

Y. Yokochi, S. Ikeda, T. Sato, and N. Yokoya: "Extrinsic camera parameter estimation based-on feature tracking and GPS data", Proc. Asian Conf. on Computer Vision (ACCV2006), Vol. I, pp. 369-378, Jan. 2006. (pdf file)
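
The way sparse absolute positions can limit accumulated drift may be illustrated with a one-dimensional toy sketch in Python. The paper's actual cost function and optimizer are not given on this page, so weighted linear least squares is used here purely as an assumption, and the names `fuse_odometry_and_gps` and `gps_weight` are hypothetical. Frame-to-frame motion estimates (standing in for vision) carry a small scale error, while a few frames also have a GPS position.

```python
import numpy as np

def fuse_odometry_and_gps(deltas, gps, gps_weight=10.0):
    """Solve for 1-D positions from relative motions and sparse GPS fixes.

    deltas[k] is the measured motion from frame k to frame k + 1.
    gps maps a frame index to an absolute position measurement.
    """
    n = len(deltas) + 1
    rows, rhs = [], []
    # Relative-motion constraints: p[k + 1] - p[k] = deltas[k].
    for k, d in enumerate(deltas):
        row = np.zeros(n)
        row[k], row[k + 1] = -1.0, 1.0
        rows.append(row)
        rhs.append(d)
    # Absolute-position constraints, weighted more strongly (assumed weight).
    for k, g in gps.items():
        row = np.zeros(n)
        row[k] = gps_weight
        rows.append(row)
        rhs.append(gps_weight * g)
    positions, *_ = np.linalg.lstsq(np.array(rows), np.array(rhs), rcond=None)
    return positions

# Synthetic demo: the camera moves one unit per frame, but the relative
# motion estimates are 2% too long, so plain integration drifts.
true_positions = np.arange(50.0)
deltas = np.full(49, 1.02)
gps = {k: float(k) for k in (0, 10, 20, 30, 40, 49)}

dead_reckoning = np.concatenate([[0.0], np.cumsum(deltas)])
fused = fuse_odometry_and_gps(deltas, gps)

drift_error = np.max(np.abs(dead_reckoning - true_positions))
fused_error = np.max(np.abs(fused - true_positions))
print(drift_error, fused_error)
```

In this sketch the integrated error grows with the length of the sequence, while the fused estimate stays bounded by the spacing of the GPS fixes.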

Realtime image mosaicing


Abstract: This paper presents a real-time video mosaicing system that is one of the practical applications of mobile vision. To realize video mosaicing on an actual mobile device, in our method, image features are automatically tracked on the input images and 6-DOF camera motion parameters are estimated with a fast and robust structure-from-motion algorithm. A preview of the mosaic image being generated is also rendered in real time to support the user. Our system is primarily intended for flat targets, but it is also capable of 3-D video mosaicing, in which an unwrapped mosaic image can be generated from a video image sequence of a curved document.

T. Sato, A. Iketani, S. Ikeda, M. Kanbara, N. Nakajima, and N. Yokoya: "Mobile video mosaicing system for flat and curved documents", Proc. 1st International Workshop on Mobile Vision (IMV2006), pp. 78-92, May 2006. (pdf file)

Feature-landmark based geometric registration


Abstract: In this research, extrinsic camera parameters of video images are estimated from correspondences between pre-constructed feature-landmarks and image features. In order to achieve real-time camera parameter estimation, the number of matching candidates is reduced by using priorities of landmarks that are determined from previously captured video sequences.

T. Taketomi, T. Sato, and N. Yokoya: "Real-time camera position and posture estimation using a feature landmark database with priorities", CD-ROM Proc. 19th IAPR Int. Conf. on Pattern Recognition (ICPR2008), Dec. 2008. (pdf file)

Image inpainting using energy function


Abstract: Image inpainting is a technique for removing undesired visual objects in images and filling the missing regions with plausible textures. In this paper, in order to improve the image quality of completed texture, the objective function is extended by allowing brightness changes of sample textures and introducing spatial locality as an additional constraint. The effectiveness of these extensions is successfully demonstrated by applying the proposed method to one hundred images and comparing the results with those obtained by the conventional methods.

N. Kawai, T. Sato, and N. Yokoya: "Image inpainting considering brightness change and spatial locality of textures", CD-ROM Proc. Int. Conf. on Computer Vision Theory and Applications (VISAPP), Vol. 1, pp. 66-73, Jan. 2008. (pdf file)

Inpainting for 3-D model


Abstract: 3-D mesh models generated with a range scanner or from video images often have holes due to occlusions by other objects and by the object itself. This paper proposes a novel method to fill the missing parts in the incomplete models. The missing parts are filled by minimizing an energy function, which is defined based on the similarity of local shape between the missing region and the rest of the object. The proposed method can generate complex and consistent shapes in the missing region.

N. Kawai, T. Sato, and N. Yokoya: "Surface completion by minimizing energy based on similarity of shape", Proc. IEEE Int. Conf. on Image Processing (ICIP2008), pp. 1532-1535, Oct. 2008. (pdf file)

Omnidirectional telepresence system


Abstract: This paper describes a novel telepresence system which enables users to walk through a photorealistic virtualized environment by actual walking. To realize such a system, a wide-angle high-resolution movie is projected on an immersive multi-screen display to present the virtualized environment to users, and a treadmill is controlled according to the detected user's locomotion. In this study, we use an omnidirectional multi-camera system to acquire images of a real outdoor scene. The proposed system provides users with a rich sense of walking in a remote site.

S. Ikeda, T. Sato, M. Kanbara, and N. Yokoya: "An immersive telepresence system with a locomotion interface using high-resolution omnidirectional movies", Proc. 17th IAPR Int. Conf. on Pattern Recognition (ICPR2004), Vol. IV, pp. 396-399, Aug. 2004. (pdf file)

My doctoral thesis:
"Reconstruction of 3-D Models of Outdoor Scenes Based on Estimating Extrinsic Camera Parameters from Multiple Image Sequences", NAIST-IS-MT9951049, March 2003.

Complete list of published papers:

Click this link to see all publications (Japanese papers are included).