Semantic-based image retrieval for multi-word text queries

Citation

Zand, Mohsen (2015) Semantic-based image retrieval for multi-word text queries. Doctoral thesis, Universiti Putra Malaysia.

Abstract

Catalyzed by the development of digital technologies, the volume of digital images being produced, archived and transmitted is reaching enormous proportions. It is hence imperative to develop techniques that can index and retrieve relevant images according to the user's information need. Image retrieval based on semantic learning of the image content has recently become a promising strategy for addressing these needs. With semantic-based image retrieval (SBIR), the semantic meanings of images are discovered and used to retrieve images relevant to the user query. Thus, digital images are automatically labeled with a set of semantic keywords describing the image content. As in text document retrieval, these keywords are then collectively used to index, organize and locate images of interest in a database. Nevertheless, understanding and discovering the semantics of a visual scene are high-level cognitive tasks that are hard to automate, which provides challenging research opportunities. Specifically, exploiting discriminatory features, handling the visual similarity between object classes and the appearance diversity within each class, classifying low-level visual features into appropriate semantic classes, annotating images comprehensively, and reliably indexing and ranking images for difficult queries remain open issues. This study proposes new ideas to overcome these challenges. First, a discriminatory image feature vector is generated using texture as a distinguishing visual feature. In the proposed method, the image texture, which is extracted by the Gabor wavelet and curvelet transforms in the spectral domain, is encoded into polynomial coefficients. This not only provides rotation-invariant features but also generates texture feature vectors with the maximum power of discrimination. Second, a context-aware and semantically consistent image descriptor is presented to exploit the image's visual attributes in a contextual space. The high-level visual space is constructed by a Dirichlet process regardless of the semantic classes, and the posteriors are then used to build the contextual space.
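
The texture step described in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the abstract does not specify how the Gabor responses are turned into polynomial coefficients, so the encoding below (sorting per-orientation filter energies and fitting a low-degree polynomial to the sorted curve) is an assumption chosen only to show one way such a descriptor could become insensitive to rotation. The curvelet transform is omitted, and the function names, kernel size, wavelengths, and polynomial degree are all hypothetical.

```python
import numpy as np


def gabor_kernel(size, wavelength, theta, sigma):
    """Even (cosine) Gabor kernel: isotropic Gaussian envelope times a cosine carrier."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    # Coordinate along the carrier direction theta.
    x_rot = x * np.cos(theta) + y * np.sin(theta)
    envelope = np.exp(-(x**2 + y**2) / (2.0 * sigma**2))
    kernel = envelope * np.cos(2.0 * np.pi * x_rot / wavelength)
    # Subtract the mean so that flat image regions produce no response.
    return kernel - kernel.mean()


def filter_energy(image, kernel):
    """Mean squared response of a circular convolution computed with the FFT."""
    padded = np.zeros(image.shape, dtype=float)
    kh, kw = kernel.shape
    padded[:kh, :kw] = kernel
    response = np.real(np.fft.ifft2(np.fft.fft2(image) * np.fft.fft2(padded)))
    return float(np.mean(response**2))


def texture_features(image, n_orientations=8, wavelengths=(4.0, 8.0), degree=3):
    """Hypothetical descriptor: polynomial coefficients of sorted orientation energies.

    ASSUMPTION: sorting followed by a polynomial fit is an illustrative encoding,
    not the method described in the thesis.
    """
    image = np.asarray(image, dtype=float)
    positions = np.linspace(0.0, 1.0, n_orientations)
    features = []
    for wavelength in wavelengths:
        energies = [
            filter_energy(
                image,
                gabor_kernel(15, wavelength, k * np.pi / n_orientations, 0.5 * wavelength),
            )
            for k in range(n_orientations)
        ]
        # Sorting discards which orientation produced which energy, so rotating
        # the image by a multiple of the orientation step leaves the curve unchanged.
        ordered = np.sort(energies)[::-1]
        features.extend(np.polyfit(positions, ordered, degree))
    return np.array(features)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    image = rng.random((32, 32))  # stand-in for a grayscale texture patch
    print(texture_features(image).shape)
```

With the default settings the descriptor has one set of `degree + 1` coefficients per wavelength. Sorting is a deliberately crude route to rotation insensitivity; it only holds exactly for rotations that map the orientation set onto itself (for example 90 degrees on a square image with eight orientations).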

Additional Metadata

Item Type:	Thesis (Doctoral)
Subject:	Image processing
Subject:	Digital techniques
Call Number:	FSKTM 2015 19
Chairman Supervisor:	Professor Shyamala A/P C. Doraisamy, PhD
Divisions:	Faculty of Computer Science and Information Technology
Depositing User:	Haridan Mohd Jais
Date Deposited:	23 Aug 2017 08:18
Last Modified:	23 Aug 2017 08:18
URI:	http://psasir.upm.edu.my/id/eprint/57134