About Me

Baoquan Zhao (赵宝全) is currently an associate professor at the School of Artificial Intelligence, Sun Yat-sen University, China. Prior to his current position, he was a Research Fellow under the supervision of Prof. Weisi Lin at the School of Computer Science and Engineering (SCSE), Nanyang Technological University, Singapore, from Sep. 2018 to Jan. 2022. He received his Ph.D. degree in computer science from Sun Yat-sen University, Guangzhou, China, in 2017. From Dec. 2014 to Dec. 2015, he spent one year visiting the Department of Informatics, New Jersey Institute of Technology, USA.
He has been serving as a reviewer for IEEE Transactions on Image Processing, IEEE Transactions on Multimedia, Information Science, Pattern Recognition, Neurocomputing, and Signal Processing, among others, and has been a technical program committee member of IEEE ICME, IEEE ICIP and AMIA. His research interests include point cloud processing and compression, visual information analysis, and multimedia systems and applications (especially open educational resources).

  zhaobaoquan@mail.sysu.edu.cn        Tangjiawan, Zhuhai, Guangdong, 519082, China

Research Experience

Theme I: 3D Point Cloud Processing and Compression

The wide availability of 3D scanning equipment and the ever-growing number of 3D applications are generating point cloud data at an unprecedented rate, which poses great challenges for efficient and economical data storage, management, transmission and processing. Recent work under this theme:

  • Image-based point cloud attribute compression
  • Point cloud simplification and mesh reconstruction
  • Deep learning based point cloud processing
  • Point cloud parameterization

Theme II: Open Educational Resource (OER) Retrieval, Analysis and Systems

With recent developments and advances in distance learning and MOOCs, the number of open educational videos on the Internet has grown dramatically in the past decade. However, most of these videos are lengthy and lack high-quality indexing and annotations, which creates an urgent demand for efficient and effective tools that facilitate video content navigation and exploration. Work under this theme:

  • Multi-modal information (including image, speech, text, human action, etc.) extraction and fusion
  • Visual-textual educational video summarization
  • Slide-based lecture video content structuring based on multi-modal cues
  • Visual systems for OER video retrieval and content exploration

Selected Publications

Point Cloud Processing and Compression

C. Lv, W. Lin, and B. Zhao, Intrinsic and Isotropic Resampling for 3D Point Clouds, IEEE Transactions on Pattern Analysis and Machine Intelligence, 2022 (accepted)

C. Lv, W. Lin, and B. Zhao, Approximate Intrinsic Voxel Structure for Point Cloud Simplification, IEEE Transactions on Image Processing, vol. 30, pp. 7241-7255, 2021

[Bibtex]

@article{lv2021approximate,
title={Approximate intrinsic voxel structure for point cloud simplification},
author={Lv, Chenlei and Lin, Weisi and Zhao, Baoquan},
journal={IEEE Transactions on Image Processing},
volume={30},
pages={7241--7255},
year={2021},
publisher={IEEE}
}

B. Zhao, W. Lin, and C. Lv, Fine-Grained Patch Segmentation and Rasterization for 3D Point Cloud Attribute Compression, IEEE Transactions on Circuits and Systems for Video Technology, vol. 31, no. 12, pp. 4590-4602, 2021

A complexity-aware heat kernel signature (HKS)-based patch segmentation method is developed to effectively partition a given point cloud into fine-grained patches that are suitable for attribute image generation while preserving the inherent spatial correlation among points.
A new patch rasterization and rectification method is developed to balance assignment energy against preservation of the intrinsic patch shape.

[Bibtex]

@article{zhao2021fine,
title={Fine-grained patch segmentation and rasterization for 3-d point cloud attribute compression},
author={Zhao, Baoquan and Lin, Weisi and Lv, Chenlei},
journal={IEEE Transactions on Circuits and Systems for Video Technology},
volume={31},
number={12},
pages={4590--4602},
year={2021},
publisher={IEEE}
}

C. Lv, W. Lin, and B. Zhao, Voxel Structure-based Mesh Reconstruction from a 3D Point Cloud, IEEE Transactions on Multimedia, vol. 24, pp. 1815-1829, 2021

A novel voxel structure-based framework is introduced to reconstruct an isotropic mesh from a point cloud while keeping important geometric features such as external and internal edges.

[Project page] [Data] [Demo] [Bibtex]

@article{lv2021voxel,
title={Voxel structure-based mesh reconstruction from a 3D point cloud},
author={Lv, Chenlei and Lin, Weisi and Zhao, Baoquan},
journal={IEEE Transactions on Multimedia},
volume={24},
pages={1815--1829},
year={2021},
publisher={IEEE}
}

J. U. Hou, B. Zhao, N. Ansari, and W. Lin, Range Image Based Point Cloud Colorization Using Conditional Generative Model, IEEE International Conference on Image Processing (ICIP), pp. 524-528, 2019

We introduce an automatic colorization scheme based on a deep generative network for 3D point clouds. The proposed approach uses the range images of point cloud geometry and trains a conditional generative adversarial network to predict the color of those images.

[Bibtex]

@inproceedings{hou2019range,
title={Range Image Based Point Cloud Colorization Using Conditional Generative Model},
author={Hou, Jong-Uk and Zhao, Baoquan and Ansari, Naushad and Lin, Weisi},
booktitle={2019 IEEE International Conference on Image Processing (ICIP)},
pages={524--528},
year={2019},
organization={IEEE}
}

Open Educational Resources Retrieval, Analysis and Systems

C. Xu, W. Jia, R. Wang, Xi. He, B. Zhao, Y. Zhang, Semantic Navigation of PowerPoint-Based Lecture Video for AutoNote Generation, IEEE Transactions on Learning Technologies, 2022 (accepted)

B. Zhao, S. Xu, S. Lin, R. Wang, and X. Luo, A New Visual Interface for Searching and Navigating Slide-Based Lecture Videos, 2019 IEEE International Conference on Multimedia and Expo (ICME), pp. 928-933, 2019

The interface comprehensively derives versatile semantic clues for video content indexing and visual aid generation according to visual elements, text, and mathematical expressions included on lecture slides, speeches recorded, as well as mouse and cursor pointing actions captured during a lecture.

[Demo (MP4, ~60MB)] [Bibtex]

@inproceedings{zhao2019new,
title={A new visual interface for searching and navigating slide-based lecture videos},
author={Zhao, Baoquan and Xu, Songhua and Lin, Shujin and Wang, Ruomei and Luo, Xiaonan},
booktitle={2019 IEEE International Conference on Multimedia and Expo (ICME)},
pages={928--933},
year={2019},
organization={IEEE}
}

C. Xu, R. Wang, S. Lin, X. Luo, B. Zhao, L. Shao, and M. Hu, Lecture2Note: Automatic Generation of Lecture Notes from Slide-Based Educational Videos, 2019 IEEE International Conference on Multimedia and Expo (ICME), pp. 898-903, 2019

Most educational/lecture videos on the Internet are lengthy and lack elaborate annotations. We introduce a novel method to generate note-like video summarizations by establishing the semantic relationship between visual entities in the slide-based lecture video and their descriptive speech texts.

[Bibtex]

@inproceedings{xu2019lecture2note,
title={Lecture2Note: Automatic Generation of Lecture Notes from Slide-Based Educational Videos},
author={Xu, Chengpei and Wang, Ruomei and Lin, Shujin and Luo, Xiaonan and Zhao, Baoquan and Shao, Lijie and Hu, Mengqiu},
booktitle={2019 IEEE International Conference on Multimedia and Expo (ICME)},
pages={898--903},
year={2019},
organization={IEEE}
}

B. Zhao, S. Lin, X. Qi, R. Wang, and X. Luo, A Novel Approach to Automatic Detection of Presentation Slides in Educational Videos, Neural Computing and Applications, vol. 29, no. 5, pp. 1369-1382, 2018

The proposed approach mainly involves five core components: shot boundary detection, training instances collection, shot classification, slide region detection and slide transition detection.

[Bibtex]

@article{zhao2018novel,
title={A novel approach to automatic detection of presentation slides in educational videos},
author={Zhao, Baoquan and Lin, Shujin and Qi, Xin and Wang, Ruomei and Luo, Xiaonan},
journal={Neural Computing and Applications},
volume={29},
number={5},
pages={1369--1382},
year={2018},
publisher={Springer}
}
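
To make the first of the five components above (shot boundary detection) concrete, here is a minimal sketch of a common histogram-difference baseline. It is not the method of the paper: the function names, the bin count and the threshold are assumptions chosen for illustration, and each frame is represented simply as a flat list of grayscale intensities.

```python
from typing import List, Sequence


def gray_histogram(frame: Sequence[int], bins: int = 8) -> List[float]:
    """Normalized intensity histogram of a frame given as 0-255 gray values."""
    counts = [0] * bins
    for value in frame:
        counts[min(value * bins // 256, bins - 1)] += 1
    total = len(frame) or 1
    return [c / total for c in counts]


def detect_shot_boundaries(
    frames: Sequence[Sequence[int]], threshold: float = 0.5, bins: int = 8
) -> List[int]:
    """Return every index i where a cut is declared between frame i-1 and frame i."""
    boundaries: List[int] = []
    prev = None
    for i, frame in enumerate(frames):
        hist = gray_histogram(frame, bins)
        if prev is not None:
            # L1 distance between two normalized histograms lies in [0, 2].
            diff = sum(abs(a - b) for a, b in zip(hist, prev))
            if diff > threshold:
                boundaries.append(i)
        prev = hist
    return boundaries


# Toy "video": two dark frames, two bright frames, one dark frame.
dark = [10] * 16
bright = [240] * 16
print(detect_shot_boundaries([dark, dark, bright, bright, dark]))  # [2, 4]
```

In a full pipeline the detected shots would then be passed on to the later stages (classification, slide region detection, slide transition detection); those stages are not sketched here.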

B. Zhao, S. Lin, X. Luo, S. Xu, and R. Wang, A Novel System for Visual Navigation of Educational Videos Using Multimodal Cues, ACM Multimedia, pp. 1680-1688, 2017

The system tightly integrates multimodal cues obtained from the visual, audio and textual channels of educational videos and presents them with a series of interactive visualization components. With the help of this system, users can explore the educational video content at multiple levels of detail to identify content of interest with ease.

[Demo (AVI, ~60MB)] [Bibtex]

@inproceedings{zhao2017novel,
title={A novel system for visual navigation of educational videos using multimodal cues},
author={Zhao, Baoquan and Lin, Shujin and Luo, Xiaonan and Xu, Songhua and Wang, Ruomei},
booktitle={Proceedings of the 25th ACM international conference on Multimedia},
pages={1680--1688},
year={2017}
}

B. Zhao, S. Lin, X. Qi, Z. Zhang, X. Luo, and R. Wang, Automatic Generation of Visual-Textual Web Video Thumbnail, ACM SIGGRAPH ASIA (Posters), pp. 1-2, 2017

We propose an automatic approach to generate magazine-cover-like thumbnails using the salient visual and textual metadata extracted from a video. Compared with a traditional snapshot, the synthesized thumbnail is more informative and attractive, which is helpful for online video selection.

[Bibtex]

@incollection{zhao2017automatic,
title={Automatic generation of visual-textual web video thumbnail},
author={Zhao, Baoquan and Lin, Shujin and Qi, Xin and Zhang, Zhiquan and Luo, Xiaonan and Wang, Ruomei},
booktitle={SIGGRAPH Asia 2017 Posters},
pages={1--2},
year={2017}
}

B. Zhao, S. Xu, S. Lin, X. Luo, and L. Duan, A New Visual Navigation System for Exploring Biomedical Open Educational Resource (OER) Videos, Journal of the American Medical Informatics Association, vol. 23, no. e1, pp. e34-e41, 2016

Biomedical videos as open educational resources (OERs) are increasingly proliferating on the Internet. Unfortunately, seeking personally valuable content from among the vast corpus of quality yet diverse OER videos is nontrivial due to limitations of today's keyword- and content-based video retrieval techniques. To address this need, this study introduces a novel visual navigation system that facilitates users' information seeking from biomedical OER videos in mass quantity by interactively offering visual and textual navigational clues that are both semantically revealing and user-friendly.

[Demo 1: Architecture (MP4, ~18MB)] [Demo 2: System (MP4, ~75MB)] [Bibtex]

@article{zhao2016new,
title={A new visual navigation system for exploring biomedical Open Educational Resource (OER) videos},
author={Zhao, Baoquan and Xu, Songhua and Lin, Shujin and Luo, Xiaonan and Duan, Lian},
journal={Journal of the American Medical Informatics Association},
volume={23},
number={e1},
pages={e34--e41},
year={2016},
publisher={Oxford University Press}
}

Image Quality Assessment

J. Hou, W. Lin, G. Yue, W. Liu, and B. Zhao, Interaction-Matrix Based Personalized Image Aesthetic Assessment, IEEE Transactions on Multimedia, 2022 (Accepted)

[Bibtex]
coming soon...

J. Hou, W. Lin, and B. Zhao, Content-Dependency Reduction with Multi-Task Learning in Blind Stitched Panoramic Image Quality Assessment, IEEE International Conference on Image Processing (ICIP), pp. 3463-3467, 2020

We propose a multi-task learning strategy that encourages the learned representation to be less dependent on image content. A Siamese network with two weight-shared CNN branches is trained to simultaneously compare the quality of two images of the same scene and predict the quality score of each image.

[Bibtex]

@inproceedings{hou2020content,
title={Content-dependency reduction with multi-task learning in blind stitched panoramic image quality assessment},
author={Hou, Jingwen and Lin, Weisi and Zhao, Baoquan},
booktitle={2020 IEEE International Conference on Image Processing (ICIP)},
pages={3463--3467},
year={2020},
organization={IEEE}
}
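
One way to picture the two tasks described above being trained together is as a single loss with a regression term and a pairwise comparison term. The sketch below is only an illustration under stated assumptions: the hinge-style ranking term, the `margin` and `weight` parameters and the function name are hypothetical and are not taken from the paper. `pred_a` and `pred_b` stand for the scores that the two weight-shared branches output for two images of the same scene.

```python
def multitask_loss(
    pred_a: float,
    pred_b: float,
    score_a: float,
    score_b: float,
    margin: float = 0.0,
    weight: float = 1.0,
) -> float:
    """Hypothetical combined loss for one image pair of the same scene."""
    # Task 1: predict the quality score of each image (mean squared error).
    regression = ((pred_a - score_a) ** 2 + (pred_b - score_b) ** 2) / 2
    # Task 2: compare the two images. sign is +1 if image a is truly better,
    # -1 if image b is truly better, and 0 for a tie.
    sign = (score_a > score_b) - (score_a < score_b)
    ranking = max(0.0, margin - sign * (pred_a - pred_b))
    return regression + weight * ranking


# Predictions that match the ground truth and its ordering incur no loss.
print(multitask_loss(0.8, 0.3, 0.8, 0.3))  # 0.0
```

Because both predictions come from branches sharing the same weights, a gradient step on this loss would update a single set of parameters for both tasks.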

Fluid Simulation

F. Wang, S. Xu, D. Jiang, B. Zhao, X. Dong, T. Zhou, and X. Luo, Particle Hydrodynamic Simulation of Thrombus Formation Using Velocity Decay Factor, Computer Methods and Programs in Biomedicine, vol. 207, 106173, 2021

The proposed method for thrombus formation simulation consists of three main steps. First, we formulate the formation of a thrombus as a particle-based model and obtain the fibrin concentration of the particles with a discretized form of the convection-diffusion-reaction equation. Then, we calculate the velocity decay factor from the obtained fibrin concentration. Finally, thrombus formation is simulated by applying the velocity decay factor to the particles.

[Bibtex]

@article{wang2021particle,
title={Particle hydrodynamic simulation of thrombus formation using velocity decay factor},
author={Wang, Fei and Xu, Songhua and Jiang, Dazhi and Zhao, Baoquan and Dong, Xiaoqiang and Zhou, Teng and Luo, Xiaonan},
journal={Computer Methods and Programs in Biomedicine},
volume={207},
pages={106173},
year={2021},
publisher={Elsevier}
}

F. Wang, S. Lin, R. Wang, Y. Li, B. Zhao, and X. Luo, Improving Incompressible SPH Simulation Efficiency by Integrating Density-Invariant and Divergence-Free Conditions, ACM SIGGRAPH (Posters), pp. 1-2, 2018

Our method shortens fluid simulation time by coupling the density-invariant and divergence-free conditions, while achieving the same simulation effect as other methods. Further, we regard the displacement of particles as the only basic variable of the continuity equation, which improves the stability of the fluid to a certain extent.

[Demo (MP4, ~70MB)] [Bibtex]

@incollection{wang2018improving,
title={Improving incompressible SPH simulation efficiency by integrating density-invariant and divergence-free conditions},
author={Wang, Fei and Lin, Shujin and Wang, Ruomei and Li, Yi and Zhao, Baoquan and Luo, Xiaonan},
booktitle={ACM SIGGRAPH 2018 Posters},
pages={1--2},
year={2018}
}

Patent

  • B. Zhao and W. Lin, Image-Based Point Cloud Attribute Compression Using Two-Stage Dimensionality Transformation, Singapore Provisional Patent Application No. 10202008512Q, 2020, PCT application (Filed)
  • X. Qi, S. Lin, and B. Zhao, Content-based Movie Video Processing and Visualization, China Patent, CN106649713B, May 12, 2020 (Awarded)

Teaching

  • C Programming (2022 Fall)
  • Natural Language Processing (2022 Fall)

Academic Services

Journal Reviewer

  • IEEE Transactions on Image Processing
  • IEEE Transactions on Multimedia
  • Neurocomputing
  • Pattern Recognition
  • Wireless Communications and Mobile Computing
  • IEEE/CAA Journal of Automatica Sinica
  • Signal Processing
  • Frontiers of Computer Science
  • Information Science

Technical Program Committee Member

  • IEEE International Conference on Acoustics, Speech and Signal Processing (2021, 2022)
  • IEEE International Conference on Multimedia and Expo (2019-2022)
    ICME 2020 Outstanding Reviewer Award
  • IEEE International Conference on Image Processing (2020-2022)
  • American Medical Informatics Association Annual Symposium (2017-2021)