SEGMENTATION OF ILLUMINATED AREAS OF SCENE USING FULLY-CONVOLUTIONAL NEURAL NETWORKS AND COMPUTER VISION ALGORITHMS FOR AUGMENTED REALITY SYSTEMS
Abstract and keywords
Abstract (English):
The relevance of this topic stems from the rapid development of virtual and augmented reality systems. The problem lies in creating natural lighting conditions for objects of the virtual world placed in real space. To determine the light sources of a scene and recover their optical parameters, a fully-convolutional neural network (FCNN) is proposed that captures features of the 'behavior of light'. The output of the FCNN is an image segmented by illumination levels and their strength. A fully-convolutional neural network is naturally well suited to image segmentation, so the VGG-16 architecture was taken as the encoder: its convolution and pooling layers reduce the input image to a 1x1 feature map, which is then classified into one of the classes characterizing illumination strength. Training was conducted on 221 training images and 39 validation images with a learning rate of 1e-2 for 200 epochs; the final loss was 0.2. For testing, the intersection-over-union (IoU) method was used, which compares the ground-truth regions of an input image with the network output pixel by pixel and yields an accuracy score. The mean IoU is 0.7: the first class is classified almost correctly, with 90 percent agreement, while the last class reaches only about 30 percent.
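As an illustration of the architecture described above, here is a minimal sketch of such a fully-convolutional segmentation network in PyTorch: a VGG-16 encoder, a 1x1 convolution that classifies each location of the feature map into an illumination-strength class, and bilinear upsampling back to the input resolution. The class count, module names, and optimizer setup are illustrative assumptions, not the authors' published code; only the learning rate 1e-2 comes from the abstract.

import torch
import torch.nn as nn
from torchvision.models import vgg16

NUM_CLASSES = 4  # assumed number of illumination-strength classes

class LightFCN(nn.Module):
    """Hypothetical FCN: VGG-16 encoder, 1x1 classifier, upsampling decoder."""
    def __init__(self, num_classes=NUM_CLASSES):
        super().__init__()
        # VGG-16 convolutional layers: five pooling stages, 32x downsampling.
        self.encoder = vgg16(weights=None).features
        # A 1x1 convolution classifies every spatial location of the feature map.
        self.classifier = nn.Conv2d(512, num_classes, kernel_size=1)
        # Bilinear upsampling restores the class map to the input resolution.
        self.upsample = nn.Upsample(scale_factor=32, mode="bilinear",
                                    align_corners=False)

    def forward(self, x):
        features = self.encoder(x)          # (N, 512, H/32, W/32)
        logits = self.classifier(features)  # (N, C, H/32, W/32)
        return self.upsample(logits)        # (N, C, H, W)

model = LightFCN()
optimizer = torch.optim.SGD(model.parameters(), lr=1e-2)  # learning rate from the abstract
criterion = nn.CrossEntropyLoss()  # per-pixel classification loss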

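The intersection-over-union test described in the abstract can be sketched as a generic per-class IoU computation; the function below is an illustration under the same assumed class count, not the authors' exact evaluation code.

import torch

def mean_iou(pred_logits, target, num_classes=4):
    """Mean per-class IoU between predicted logits and a ground-truth label map."""
    pred = pred_logits.argmax(dim=1)  # (N, H, W) hard class labels
    ious = []
    for c in range(num_classes):
        intersection = ((pred == c) & (target == c)).sum().float()
        union = ((pred == c) | (target == c)).sum().float()
        if union > 0:  # skip classes absent from both prediction and ground truth
            ious.append(intersection / union)
    return torch.stack(ious).mean()
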
Keywords:
classification, illumination, convolutional neural networks, segmentation
References

1. Hold-Geoffroy, Y., Sunkavalli, K., Hadap, S., Gambaretto, E. and Lalonde, J.-F., "Deep outdoor illumination estimation," Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2017).

2. Lalonde, J.-F., Efros, A. A. and Narasimhan, S. G., "Estimating the natural illumination conditions from a single outdoor image," International Journal of Computer Vision, 98(2), 123-145 (2012).

3. Gardner, M.-A., Sunkavalli, K., Yumer, E., Shen, X., Gambaretto, E., Gagné, C. and Lalonde, J.-F., "Learning to predict indoor illumination from a single image," ACM Transactions on Graphics (Proceedings of SIGGRAPH Asia) 36(6) (2017).

4. Lombardi, S. and Nishino, K., "Reflectance and Illumination Recovery in the Wild," IEEE Transactions on Pattern Analysis and Machine Intelligence 38, 129-141 (2016).

5. Eigen, D. and Fergus, R., "Predicting Depth, Surface Normals and Semantic Labels with a Common Multi-Scale Convolutional Architecture," International Conference on Computer Vision (2015).

6. Girshick, R. B., Donahue, J., Darrell, T. and Malik, J., "Rich feature hierarchies for accurate object detection and semantic segmentation," CVPR (2014).

7. Sermanet, P., Eigen, D., Zhang, X., Mathieu, M., Fergus, R. and LeCun, Y., "Overfeat: Integrated recognition, localization and detection using convolutional networks," ICLR (2013).

8. Simonyan, K. and Zisserman, A., "Very deep convolutional networks for large-scale image recognition," CoRR, abs/1409.1556 (2014).

9. Szegedy, C., Liu, W., Jia, Y., Sermanet, P., Reed, S., Anguelov, D., Erhan, D., Vanhoucke, V. and Rabinovich, A., "Going deeper with convolutions," CoRR, abs/1409.4842 (2014).

10. "Lumicept | Integra Inc.," Integra Inc., 2019, <https://integra.jp/en/products/lumicept> (April 12, 2019).

11. Zhdanov, D. D., Ershov, S. V. and Voloboy, A. G., "A method for suppressing stochastic noise in images generated by Monte Carlo ray tracing while preserving fine details" [in Russian], Keldysh Institute Preprints, No. 194, 15 p. (2018).

12. Ershov, S. V., Zhdanov, D. D. and Voloboy, A. G., "Modification of stochastic ray tracing for noise reduction on diffuse surfaces" [in Russian], Keldysh Institute Preprints, No. 204, 17 p. (2018).

13. Heymann, S., Smolic, A., Müller, K. and Froehlich, B., "Illumination reconstruction from real-time video for interactive augmented reality," International Workshop on Image Analysis for Multimedia Interactive Services (WIAMIS) (2005).

14. Marques, B. A. D., Drumond, R. R., Vasconcelos, C. N. and Clua, E., "Deep light source estimation for mixed reality," VISIGRAPP 2018 - Proceedings of the 13th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications, 303-311 (2018).

15. Jiddi, S., Robert, P. and Marchand, E., "Illumination Estimation Using Cast Shadows for Realistic Augmented Reality Applications," Adjunct Proceedings of the 2017 IEEE International Symposium on Mixed and Augmented Reality (ISMAR-Adjunct) (2017).

16. Frahm, J.-M., Koeser, K., Grest, D. and Koch, R., "Markerless Augmented Reality with Light Source Estimation for Direct Illumination," European Conference on Visual Media Production, 211-220 (2005).

17. Long, J., Shelhamer, E. and Darrell, T., "Fully Convolutional Networks for Semantic Segmentation," The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 3431-3440 (2015).

18. Wang, X., Zhdanov, D. D., Potemin, I. S., Wang, Y. and Cheng, H., "The efficient model to define a single light source position by use of high dynamic range image of 3D scene," Proc. SPIE 10020, Optoelectronic Imaging and Multimedia Technology IV, 100200I (October 31, 2016).
