Publications

Conference and journal publications

2012

  • Auvinen H, Raitio T, Siltanen S, Alku P: Utilizing Markov Chain Monte Carlo (MCMC) method for improved glottal inverse filtering. Proceedings of Interspeech’12, 2012.
  • Gamper H., Dicke C., Billinghurst M. and Puolamäki K: Sound sample detection and numerosity estimation using auditory display. ACM Transactions on Applied Perception, 2013 (accepted for publication).
  • Hirvenkari L, Ruusuvuori J, Saarinen V-M, Kivioja M, Peräkylä A and Hari R: Influence of turn-taking in a two-person conversation on the gaze of a viewer. Under revision.
  • Jokinen E, Yrttiaho S, Pulakka H, Vainio M, Alku P: Signal-to-noise ratio adaptive post-filtering method for intelligibility enhancement of telephone speech. Journal of the Acoustical Society of America, 132(6), pp. 3990-4001, 2012.
  • Kandemir M, Klami A, Vetek A, and Kaski S: Unsupervised inference of auditory attention from biosensors. Proceedings of the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML PKDD 2012), pp. 403–418, 2012.
  • Kandemir M and Kaski S: Learning relevance from natural eye movements in pervasive interfaces. Proceedings of the International Conference on Multimodal Interaction (ICMI), pp. 85–92, 2012.
  • Kayal S: Experiments on the LFW database using curvelet transforms and a random forest-kNN cascade. Proceedings of the Second International Conference on Digital Information Processing and Communications, 2012.
  • Koskinen M, Viinikanoja J, Kurimo M, Klami A, Kaski S and Hari R: Identifying fragments of natural speech from the listener’s MEG signals. Human Brain Mapping, 2012, doi: 10.1002/hbm.22004.
  • Kujala MN, Carlson S and Hari R: Engagement of amygdala in third-person view of face-to-face interaction. Human Brain Mapping 2012, 33, pp. 1753–1762.
  • Kujala MN, Kujala J, Carlson S and Hari R: Brain correlates of observing interaction between dogs: Effects of expertise. PloS One 2012, 7(6): e39145.
  • Kytö M, Hakala J, Oittinen P, and Häkkinen J: Effect of camera separation on the viewing experience of stereoscopic photographs. Journal of Electronic Imaging, 21(1), pp. 011011:1-011011:9, 2012, doi:10.1117/1.JEI.21.1.011011.
  • Kytö M, Mäkinen A, Häkkinen J, and Oittinen P: Improving Relative Depth Judgments in Augmented Reality with Auxiliary Augmentations. ACM Transactions on Applied Perception, 2012, accepted.
  • Lindeman R. W., Lee G., Beattie L., Gamper H., Pathinarupothi R. and Akhilesh A: GeoBoids: A Mobile AR Application for Exergaming. Proceedings of International Symposium on Mixed and Augmented Reality (ISMAR), 2012.
  • Lorenzo-Trueba J, Barra-Chicote R, Raitio T, Obin N, Alku P, Yamagishi J, Montero JM: Towards glottal source controllability in expressive speech synthesis. Proceedings of Interspeech’12, 2012.
  • Mansikkaniemi A and Kurimo M: Adaptation of morpheme-based speech recognition for foreign entity names. Proceedings of the Fifth International Conference Human Language Technologies - The Baltic Perspective, 2012.
  • Mansikkaniemi A and Kurimo M: Unsupervised vocabulary adaptation for morph-based language models. Proceedings of the NAACL 2012 Workshop on the Future of Language Modeling for HLT, 2012.
  • Pulakka H, Laaksonen L, Myllylä V, Yrttiaho S, Alku P: Conversational evaluation of speech bandwidth extension using a mobile handset. IEEE Signal Processing Letters, 19(4), pp. 203-206, 2012.
  • Pulakka H, Laaksonen L, Myllylä V, Yrttiaho S, Alku P: Conversational evaluation of artificial bandwidth extension of telephone speech using a mobile handset. Proceedings of IEEE Int. Conference on Acoustics, Speech, and Signal Processing (ICASSP’12), 2012.
  • Pulakka H, Remes U, Yrttiaho S, Palomäki K, Kurimo M, Alku P: Bandwidth extension of telephone speech to low frequencies using sinusoidal synthesis and Gaussian mixture model. IEEE Transactions on Audio, Speech, and Language Processing, 20(8), pp. 2219-2231, 2012.
  • Pulakka H, Laaksonen L, Yrttiaho S, Myllylä V, Alku P: Conversational quality evaluation of artificial bandwidth extension of telephone speech. Journal of the Acoustical Society of America, 132(2), pp. 848-861, 2012.
  • Raitio T, Takanen M, Santala O, Suni A, Vainio M, Alku P: On measuring the intelligibility of synthetic speech in noise – Do we need a realistic noise environment? Proceedings of IEEE Int. Conference on Acoustics, Speech, and Signal Processing (ICASSP’12), 2012.
  • Raitio T, Suni A, Vainio M, Alku P: Wideband parametric speech synthesis using warped linear prediction. Proceedings of Interspeech’12, 2012.
  • Sjöberg M, Koskela M, Ishikawa S, Laaksonen J: Real-Time Large-Scale Visual Concept Detection with Linear Classifiers. Proceedings of 21st International Conference on Pattern Recognition. 2012.
  • Sjöberg M, Koskela M, Ishikawa S, Laaksonen J, Oja E: PicSOM Experiments in TRECVID 2012. Proceedings of the TRECVID 2012 Workshop. 2012.
  • Suni A, Raitio T, Vainio M, Alku P: The GlottHMM Entry for Blizzard Challenge 2012: Hybrid Approach. Proceedings of the Blizzard 2012 Workshop, 2012.

2011

  • Ajanki A, Billinghurst M, Gamper H, Järvenpää T, Kandemir M, Kaski S, Koskela M, Kurimo M, Laaksonen J, Puolamäki K, Ruokolainen T and Tossavainen T: An Augmented Reality Interface to Contextual Information. Virtual Reality, 15:161-173, 2011. (pre-print pdf)
  • Ajanki A and Kaski S: Probabilistic proactive timeline browser. In Proceedings of the 21st International Conference on Artificial Neural Networks (ICANN), Part II, pp. 357-364, 2011. (doi)
  • Gamper H, Dicke C, and Puolamäki K: Design guidelines for auditory display of points of interest. ACM Transactions on Applied Perception (TAP), pp. 1-16, 2011-2012 (submitted).
  • Gamper H and Lokki T: Spatialisation in audio augmented reality using finger snaps. In Principles and Applications of Spatial Hearing, eds. Yoiti Suzuki, Douglas Brungart and Hiroaki Kato, pp. 383-392, World Scientific Publishing, 2011.
  • Gamper H, Tervo S, and Lokki T: Head orientation tracking using binaural headset microphones. In 131st AES Convention, New York, NY, USA, Oct. 2011, pp. 1-7.
  • Kafentzis G, Stylianou Y, Alku P: Glottal inverse filtering using Stabilised Weighted Linear Prediction. Proc. of the International Conference on Acoustics, Speech, and Signal Processing (ICASSP’11), Prague, Czech Republic, May 22-27, 2011.
  • Kytö M, Nuutinen M, and Oittinen P: Method for measuring stereo camera depth accuracy based on stereoscopic vision, Three-Dimensional Imaging, Interaction and Measurement, Proc. SPIE vol. 7864, pp. 78640I-78640I-9, 2011. (doi)
  • Kytö M and Hakala J: Geometric and subjective analysis of stereoscopic I3A cluster images, Stereoscopic Displays and Applications, Proc. SPIE vol. 7863, 2011. (doi)
  • Kytö M, Hakala J, Oittinen P and Häkkinen J: Effect of camera separation on the viewing experience of stereoscopic photographs. Journal of Electronic Imaging, 21, 1, 011011–011011–9, 2012.
  • Kytö M, Häkkinen J, Oittinen P: Stereoscopic viewing facilitates the perception of crowds. IEEE International Conference on Advanced Video and Signal-Based Surveillance (AVSS), Klagenfurt, Austria, 30.8-2.9.2011. pp. 49-53.
  • Pihko E, Virtanen A, Saarinen V-M, Hirvenkari L, Tossavainen T, Haapala A and Hari R: Influence of expertise and the painting’s abstraction level on experiencing art. Frontiers in Human Neuroscience 2011; 5: Article 94 (10 pages).
  • Pohjalainen J, Raitio T, Alku P: Detection of shouted speech in the presence of ambient noise. Proc. of Interspeech’11, Florence, Italy, Aug. 28-31, 2011.
  • Pulakka H, Alku P: Bandwidth extension of telephone speech using a neural network and a filterbank implementation for highband mel spectrum. IEEE Transactions on Audio, Speech, and Language Processing, Vol. 19, No. 7, pp. 2170-2183, 2011.
  • Pulakka H, Remes U, Palomäki K, Kurimo M, Alku P: Speech bandwidth extension using Gaussian Mixture Model-based estimation of the highband Mel spectrum. Proc. of the International Conference on Acoustics, Speech, and Signal Processing (ICASSP’11), Prague, Czech Republic, May 22-27, 2011.
  • Pulakka H, Remes U, Yrttiaho S, Palomäki K, Kurimo M, Alku P: Low-frequency bandwidth extension of telephone speech using sinusoidal synthesis and Gaussian mixture model. Proc. of Interspeech’11, Florence, Italy, Aug. 28-31, 2011.
  • Raitio T, Suni A, Pulakka H, Vainio M, Alku P: Utilizing glottal source pulse library for generating improved excitation signal for HMM-based speech synthesis. Proc. of the International Conference on Acoustics, Speech, and Signal Processing (ICASSP’11), Prague, Czech Republic, May 22-27, 2011.
  • Raitio T, Suni A, Vainio M, Alku P: Analysis of HMM-based Lombard speech synthesis. Proc. of Interspeech’11, Florence, Italy, Aug. 28-31, 2011.
  • Raitio T, Suni A, Yamagishi J, Pulakka H, Nurminen J, Vainio M, Alku P: HMM-based speech synthesis utilizing glottal inverse filtering. IEEE Transactions on Audio, Speech, and Language Processing, Vol. 19, No. 1, pp. 153-165, 2011.
  • Schürmann M, Hlushchuk Y and Hari R: Embodied visual perception of distorted finger postures. Human Brain Mapping 2011, 32: 612–623.
  • Sousa R, Ferreira A, Alku P: Estimation of harmonic noise components of the glottal excitation. Proc. of the 7th International Workshop on Models and Analysis of Vocal Emissions for Biomedical Applications (MAVEBA’11), Florence, Italy, Aug. 25-27, 2011.
  • Suni A, Raitio T, Vainio M, Alku P: The GlottHMM entry for Blizzard Challenge 2011: Utilizing source unit selection in HMM-based speech synthesis for improved excitation generation. Proc. of the ISCA Blizzard Challenge 2011 Workshop, Turin, Italy, Sept. 2, 2011.
  • Zhang H, Ruokolainen T, Laaksonen J, Hochleitner C, and Traunmüller R: Gaze- and speech-enhanced content-based image retrieval in image tagging. In Proc. of 21st International Conference on Artificial Neural Networks (ICANN 2011), Espoo, Finland, 2011.

2010

  • A. Ajanki, M. Billinghurst, M. Kandemir, S. Kaski, M. Koskela, M. Kurimo, J. Laaksonen, K. Puolamäki and T. Tossavainen: Ubiquitous Contextual Information Access with Proactive Retrieval and Augmentation. In Proceedings of the Fourth International Workshop on Ubiquitous Virtual Reality (IWUVR), Helsinki, Finland, 2010. (pdf)
  • A. Ajanki, M. Billinghurst, H. Gamper, T. Järvenpää, M. Kandemir, S. Kaski, M. Koskela, M. Kurimo, J. Laaksonen, K. Puolamäki, T. Ruokolainen and T. Tossavainen: Contextual Information Access with Augmented Reality. In Proceedings of IEEE International Workshop on Machine Learning for Signal Processing (MLSP), pp. 95-100, Kittilä, Finland, 2010. (doi)
  • M. Kandemir, V-M. Saarinen, and S. Kaski: Inferring object relevance from gaze in dynamic scenes. In Proc. ETRA 2010, Eye Tracking Research & Applications, pp. 105-108, 2010. (doi)
  • L. Hirvenkari, J. Ruusuvuori, V-M. Saarinen, M. Kivioja, A. Peräkylä and R. Hari: Intersubject synchrony during viewing turn-taking in a two-person conversation: An eye tracking study. Submitted for publication.
  • M. Kaksonen, V-M. Saarinen, P. Vuilleumier, and R. Hari: Viewing neutral and fearful faces: Effect of spatial frequency on eye movement patterns. Under preparation.
  • M. N. Kujala, S. Carlson, and R. Hari: Engagement of amygdala in third-person view of face-to-face interaction. Human Brain Mapping, under revision.
  • M. N. Kujala, J. Kujala, S. Carlson and R. Hari: Experience influences brain responses to reading social interaction of dogs. Under preparation.
  • T. Raitio, A. Suni, J. Yamagishi, H. Pulakka, J. Nurminen, M. Vainio, P. Alku: HMM-based speech synthesis utilizing glottal inverse filtering. IEEE Transactions on Audio, Speech, and Language Processing, Vol. 19, No. 1, pp. 153-165, 2010. (doi)
  • T. Raitio, A. Suni, H. Pulakka, M. Vainio, P. Alku: Comparison of formant enhancement methods for HMM-based speech synthesis. In CD Proceedings of the 7th ISCA Speech Synthesis Workshop, Kyoto, Japan, Sept. 22-24, 2010.
  • M. Schürmann, Y. Hlushchuk and R. Hari: Embodied visual perception of distorted finger postures. Human Brain Mapping, 2010. (doi)
  • M. Sjöberg, M. Koskela, V. Viitaniemi, and J. Laaksonen: PicSOM experiments in ImageCLEF RobotVision. In Devrim Ünay, Zehra Çataltepe, and Selim Aksoy, editors, Recognizing Patterns in Signals, Speech, Images and Videos, volume 6388 of Lecture Notes in Computer Science, pages 190–199, Istanbul, Turkey, August 2010. Springer Berlin / Heidelberg. (doi)
  • M. Sjöberg, M. Koskela, V. Viitaniemi, and J. Laaksonen: Indoor location recognition using fusion of SVM-based visual classifiers. In Proceedings of 2010 IEEE International Workshop on Machine Learning for Signal Processing, Kittilä, Finland, August-September 2010. (doi)
  • A. Suni, T. Raitio, M. Vainio, P. Alku: The GlottHMM speech synthesis entry for Blizzard Challenge 2010. In CD Proceedings of the Blizzard Challenge 2010 Workshop, Kyoto, Japan, Sept. 25, 2010. (pdf)
  • T. Tossavainen: Approximate and SQP Two View Triangulation. 10th Asian Conference on Computer Vision, Queenstown, New Zealand, 8-12.11.2010.

2009

  • A. Ajanki, D. R. Hardoon, S. Kaski, K. Puolamäki, and J. Shawe-Taylor: Can eyes reveal interest? - Implicit queries from gaze patterns. User Modeling and User-Adapted Interaction: The Journal of Personalization Research, 19:307-339, 2009. (doi)
  • M. Hiipakka, M. Tikander, and M. Karjalainen: Modeling of external ear acoustics for insert headphone usage. In Proc. 126th AES Convention, Munich, Germany, May 7-10, 2009. (pdf)
  • M. Hiipakka, M. Tikander, and M. Karjalainen: Modeling of external ear acoustics for insert headphone usage. Invited and submitted to Journal of the Audio Engineering Society (JAES).
  • L. Hirvenkari, V. Jousmäki, S. Lamminmäki, V.-M. Saarinen, M. Sams and R. Hari: Gaze-based MEG averaging during audiovisual speech perception. Frontiers in Human Neuroscience, in press.
  • L. Hirvenkari, V. Jousmäki, S. Lamminmäki, V.-M. Saarinen, M. Sams and R. Hari: Brain correlates of audiovisual speech perception during natural viewing. Submitted.
  • L. Hirvenkari, J. Ruusuvuori, V.-M. Saarinen, H. Laaksonen, M. Kivioja, A. Peräkylä and R. Hari: Puheenvuoron vaihtojen seuraaminen kahden henkilön keskustelussa: Silmänliikeanalyysi [Following turn-taking in a two-person conversation: An eye movement analysis]. Kahdeksannet Keskusteluntutkimuksen Päivät, Turku 2009. (abstract).
  • M. Koskela, M. Sjöberg, and J. Laaksonen: Improving automatic video retrieval with semantic concept detection. In Proc. 16th Scandinavian Conference on Image Analysis (SCIA 2009), Oslo, Norway, 2009. (pdf)
  • L. Kozma, A. Klami, and S. Kaski: GaZIR: Gaze-based zooming interface for image retrieval. In Proc. ICMI-MLMI 2009, The Eleventh International Conference on Multimodal Interfaces and The Sixth Workshop on Machine Learning for Multimodal Interaction, pages 305-312, New York, NY, USA, 2009. ACM. (pdf)
  • M. N. Kujala, S. Carlson, and R. Hari: Dog experts read body postures across species: Brain basis of interpreting human or canine social interaction. Submitted.
  • K. Pasupa, C. Saunders, S. Szedmak, A. Klami, S. Kaski, and S. Gunn: Learning to rank images from eye movements. In IEEE International Workshop on Human-Computer Interaction (HCI2009), pp. 2009-2016, October 4, 2009, Kyoto, Japan. (doi)
  • M. Tikander: Usability issues in listening to natural sounds with an augmented reality audio headset. Journal of the Audio Engineering Society (JAES), 57: 430–441, 2009.

2008

  • K. Puolamäki, A. Ajanki, and S. Kaski: Learning to Learn Implicit Queries from Gaze Patterns. In Proceedings of International Conference on Machine Learning (ICML2008), pages 760-767, 2008. (pdf)
  • M. Tikander, M. Karjalainen, and V. Riikonen: An Augmented Reality Audio Headset. In International Conference on Digital Audio Effects (DAFx-08), Espoo, Finland, Sept. 2008. (pdf)

Theses

Dissertations

  • M. Tikander: Development and Evaluation of Augmented Reality Audio Systems. Doctoral thesis. Report no. 13 / Helsinki University of Technology TKK, Department of Signal Processing and Acoustics. Espoo, Finland, November 2009.

Master's Theses

  • T. Günther: Developing a Context-Aware Mobile Augmented Reality Application. MSc Thesis, Aalto University School of Science and Technology, 2010.
  • A. Järvinen: Rakennusten visualisointi lisätyn todellisuuden avulla [Visualization of buildings using augmented reality]. MSc Thesis, Aalto University School of Science and Technology, 2010.
  • L. Kozma: A proactive interface for image retrieval. MSc Thesis, Department of Information and Computer Science, Helsinki University of Technology, May 2009.
  • M. Kytö: Evaluation method for stereo camera accuracy in augmented reality. MSc Thesis, Department of Media Technology, Helsinki University of Technology, March 2009.
  • S. Lavinen: Augmented Reality Concepts for Urban Planning. MSc Thesis, Aalto University School of Engineering, 2010.
  • T. Ruokolainen: Topic adaptation for speech recognition in multimodal environment. MSc Thesis, Helsinki University of Technology, October 2009.
  • J. Wu: Online Face Recognition with Application to Proactive Augmented Reality. MSc Thesis, Aalto University School of Science and Technology, 2010.

Related publications in earlier projects