Abstract
Multimedia content is usually complex and may contain many semantically meaningful, interrelated elements. Therefore, to understand the high-level semantic meaning of the content, these interrelations need to be learned and exploited to improve the search process. We introduce our ideas on how to enable automatic construction of semantic context by learning from the content. Depending on the targeted source of content, representation schemes for its semantic context can be constructed by learning from data. In the target representation scheme, metadata is divided into three levels: low, mid, and high. Under the proposed scheme, high-level features are derived from the mid-level features. To explore the hidden interrelationships between the mid-level and high-level terms, a Bayesian network model is built from a small amount of training data. Semantic inference and reasoning are then performed on the model to decide the relevance of a video.
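The inference step described above can be illustrated with a minimal sketch. It assumes the simplest possible network structure, in which one binary high-level term is the parent of several binary mid-level terms (a naive Bayes layout), and it estimates the conditional probabilities from a handful of labelled examples using Laplace smoothing. The structure, the term names (`sky`, `water`, `building`), the smoothing constant, and the function names are all illustrative assumptions and are not taken from the chapter, which may learn a richer network structure from data.

```python
from collections import defaultdict


def train(samples, alpha=1.0):
    """Estimate priors and conditional probability tables.

    samples: list of (mid_level, label) pairs, where mid_level maps a
    mid-level term to True/False and label is the high-level term (bool).
    alpha: Laplace smoothing constant (an assumption, helps with small data).
    """
    class_counts = {True: 0, False: 0}
    feature_counts = {True: defaultdict(int), False: defaultdict(int)}
    features = set()
    for mid_level, label in samples:
        class_counts[label] += 1
        for name, present in mid_level.items():
            features.add(name)
            if present:
                feature_counts[label][name] += 1

    total = len(samples)
    prior = {c: (class_counts[c] + alpha) / (total + 2 * alpha)
             for c in (True, False)}
    # cpt[c][f] = P(mid-level term f is present | high-level term == c)
    cpt = {c: {f: (feature_counts[c][f] + alpha) / (class_counts[c] + 2 * alpha)
               for f in features}
           for c in (True, False)}
    return prior, cpt


def posterior(model, observed):
    """Return P(high-level term is True | observed mid-level terms)."""
    prior, cpt = model
    scores = {}
    for c in (True, False):
        p = prior[c]
        for name, present in observed.items():
            if name not in cpt[c]:
                continue  # ignore terms never seen during training
            p *= cpt[c][name] if present else 1.0 - cpt[c][name]
        scores[c] = p
    return scores[True] / (scores[True] + scores[False])


# Hypothetical training set: the high-level term could be "seaside scene".
samples = [
    ({"sky": True, "water": True, "building": False}, True),
    ({"sky": True, "water": True, "building": False}, True),
    ({"sky": True, "water": False, "building": True}, False),
    ({"sky": False, "water": False, "building": True}, False),
]
model = train(samples)
p_relevant = posterior(model, {"sky": True, "water": True, "building": False})
print(f"P(relevant | evidence) = {p_relevant:.3f}")
```

A video would then be judged relevant when the posterior exceeds a chosen threshold. Unobserved mid-level terms are simply left out of the evidence, which is one reason a probabilistic model suits partially annotated content.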
Rights and permissions
Open Access This chapter is licensed under the terms of the Creative Commons Attribution-NonCommercial 2.5 International License (http://creativecommons.org/licenses/by-nc/2.5/), which permits any noncommercial use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.
The images or other third party material in this chapter are included in the chapter’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.
Copyright information
© 2011 The Author(s)
About this paper
Cite this paper
Zhang, Q., Izquierdo, E. (2011). Semantic Context Inference in Multimedia Search. In: Domingue, J., et al. The Future Internet. FIA 2011. Lecture Notes in Computer Science, vol 6656. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-20898-0_28
DOI: https://doi.org/10.1007/978-3-642-20898-0_28
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-20897-3
Online ISBN: 978-3-642-20898-0
eBook Packages: Computer Science, Computer Science (R0)