Geometric and Illumination Invariant Object Representation: Application to Content-based Image Retrieval

BibTeX entry:

@phdthesis { VG:Sta1998,
    author = { Sergei Startchik },
    title = { Geometric and Illumination Invariant Object Representation: Application to Content-based Image Retrieval },
    school = { University of Geneva },
    year = { 1998 },
    type = { {P}h.{D}. {D}issertation {N}o. 3009 },
    address = { Switzerland },
    month = { July },
    note = { Thesis Jury: Prof. Thierry Pun (Geneva, CH), Prof. Roger Mohr (INP Grenoble, F), Serge Ayer (EPF-Lausanne, CH), Prof. Christian Pellegrini (Geneva, CH) },
    url = { },
    abstract = { This work addresses several issues in the field of computer vision. In particular, attention is focussed on the problem of the representation of an object from its appearance in an image. Several advances are proposed for the representation of planar shapes, which are thus suitable for representing planar and faceted objects. The representation developed is employed for content-based retrieval from an image database. A projectively invariant description is proposed for groups of planar disjoint contours, constructed as a simultaneous polar reparametrization of multiple curves. Its origin is an invariant point and, for each ray orientation, the cross-ratio of the intersections with the closest curves gives the radius. The sequence of cross-ratio values for all orientations forms a signature. With respect to other methods, this representation is less reliant on individual curve properties, both for the construction of the reference frame and for the calculation of the signature. At the same time, this representation is local and integrates information from multiple curves, guaranteeing robustness to curve discontinuities and partial occlusions. Chromatic information is introduced into the representation and offers two advantages. First, the representation provides a more complete description of the shape and thus becomes more discriminative. Secondly, the chromatic description is illumination invariant under a diagonal chromaticity model and one more acquisition variable is therefore removed. The proposed representation was originally developed for planar shapes, but an extension has been proposed and validated for trihedral corners. [truncated] },
    abstract2 = { A complete system architecture has been implemented, composed of the following stages: feature extraction, reference frame construction, signature evaluation and indexing. The feature extraction stage provides a set of image contours approximated by splines. Joint invariant properties of curves are used to define the center point of the reference frame and the associated rays. Invariant signatures are computed from a combination of local properties of multiple curves. These signatures are used as a multidimensional index into a database of signatures and a subset of plausible object models is thus obtained. The invariant signature method has been used for object representation in the context of content-based retrieval from image databases. In particular, we focus on images which portray man-made objects with planar facets or trihedral corners, which contain trademarks. The database consists of 203 images of 41 such objects. Images were taken from different viewpoints under various illumination conditions. Experimental evaluation has shown that the method is stable under these realistic variations and that its performance in this framework is satisfactory. In conclusion, we believe that this approach is an important extension of shape representation methods to a much broader class of objects. },
    vgclass = { thesis },
    vgproject = { cbir }
}
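
Illustrative note (not part of the thesis record): the abstract describes a signature made of one cross-ratio per ray. The Python sketch below is a toy illustration of why such values survive a change of viewpoint; it is not the thesis implementation. It assumes four collinear points per ray (the origin plus one intersection with each of three curves), uses hypothetical star-shaped curves given as radius functions, and uses an arbitrary homography H to stand in for a viewpoint change. How ray orientations are matched between views is not addressed here; the sketch simply compares values ray by ray.

```python
import math

def apply_homography(H, p):
    """Map a 2D point p = (x, y) through a 3x3 homography H (nested lists)."""
    x, y = p
    u = H[0][0] * x + H[0][1] * y + H[0][2]
    v = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return (u / w, v / w)

def cross_ratio(a, b, c, d):
    """Cross-ratio (AC * BD) / (BC * AD) of four collinear 2D points."""
    dx, dy = d[0] - a[0], d[1] - a[1]

    def t(p):
        # Signed position of p along the line a -> d (scale cancels in the ratio).
        return (p[0] - a[0]) * dx + (p[1] - a[1]) * dy

    ta, tb, tc, td = t(a), t(b), t(c), t(d)
    return ((tc - ta) * (td - tb)) / ((tc - tb) * (td - ta))

def signature(radius_fns, n_rays, H=None):
    """One cross-ratio per ray: origin plus one intersection per toy curve.

    radius_fns must hold exactly three functions r_k(theta) describing
    star-shaped curves around the origin (an assumption of this sketch).
    """
    values = []
    for i in range(n_rays):
        theta = 2.0 * math.pi * i / n_rays
        ux, uy = math.cos(theta), math.sin(theta)
        pts = [(0.0, 0.0)] + [(r(theta) * ux, r(theta) * uy) for r in radius_fns]
        if H is not None:
            # Simulate a new viewpoint by mapping every point through H.
            pts = [apply_homography(H, p) for p in pts]
        values.append(cross_ratio(*pts))
    return values

# Hypothetical toy curves: nested, disjoint, star-shaped around the origin.
CURVES = [
    lambda th: 1.0 + 0.2 * math.cos(3 * th),
    lambda th: 2.0 + 0.3 * math.sin(2 * th),
    lambda th: 3.5,
]

# Arbitrary homography chosen so the denominator w stays positive on the curves.
H = [[1.1, 0.2, 3.0],
     [-0.1, 0.9, -2.0],
     [0.02, -0.01, 1.0]]

if __name__ == "__main__":
    before = signature(CURVES, 36)
    after = signature(CURVES, 36, H)
    max_diff = max(abs(x - y) for x, y in zip(before, after))
    print("largest per-ray difference:", max_diff)
```

The largest per-ray difference should sit at floating-point round-off level, since a homography maps lines to lines and preserves the cross-ratio of collinear points.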
