

Excellent Ph.D. | Wang Yashen: Research on Vectorization Representation and Retrieval Based on Conceptual Short Text

Name: Wang Yashen

Major: Computer Science and Technology

College: School of Computer Science

Supervisor: Huang Heyan

Mode: Direct Ph.D. student (admitted straight from a bachelor's degree)

Dissertation Topic: Research on Vectorization Representation and Retrieval Based on Conceptual Short Text

1. Main Content and Innovations of the Doctoral Dissertation

Awarding society: Chinese Association for Artificial Intelligence

 

Research Content

The dissertation addresses short-text representation for social media, short-text clustering, and microblog retrieval, where existing short-text representation models suffer from limited semantic expressiveness and limited generality. The main research contents include:

1) A short-text conceptualization scheme that supports collaborative modeling of, and reasoning over, heterogeneous semantic associations.

2) A short-text vectorization scheme that simulates human reading attention habits.
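
As a rough illustration of what attention-style weighting can mean for a short-text vector, the sketch below pools word vectors with weights derived from their similarity to a context vector. It is a generic dot-product attention example, not the model from the dissertation; the toy vectors, the `concept` vector, and the function names are all assumptions made for illustration.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_pool(word_vectors, query):
    """Weighted average of word vectors.

    Each weight comes from the dot product between a word vector and
    the query (context) vector, normalized with softmax.
    """
    scores = [sum(w * q for w, q in zip(vec, query)) for vec in word_vectors]
    weights = softmax(scores)
    dim = len(query)
    pooled = [
        sum(wt * vec[i] for wt, vec in zip(weights, word_vectors))
        for i in range(dim)
    ]
    return pooled, weights

# Toy 2-d word vectors (made up for this example).
vectors = {"the": [0.0, 0.0], "apple": [1.0, 0.0], "released": [0.0, 1.0]}
# Hypothetical context vector, standing in for concept-level information.
concept = [1.0, 1.0]

sentence = ["the", "apple", "released"]
pooled, weights = attention_pool([vectors[w] for w in sentence], concept)
print(weights)  # "the" receives the smallest weight
print(pooled)
```

In this toy setting the function word "the" contributes least to the pooled vector, which is the intuition behind weighting words by how informative they are in context.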

 

Innovations

1) The dissertation makes full use of multiple types of semantic associations between concepts and words, and proposes a short-text conceptualization algorithm based on a co-ranking framework. This overcomes a shortcoming of previous studies, which could not fully integrate multiple types of associations. It also achieves, for the first time in this research direction, the collaborative extraction of context keywords.

2) The dissertation introduces conceptual information at a higher semantic level, together with an attention mechanism modeled on human reading habits, into short-text vectorization research. It proposes a conceptual sentence embedding model based on an attention mechanism, which significantly enhances the semantic expressiveness of the generated short-text vectors and their ability to discriminate between the senses of polysemous words. The model is not tied to a specific domain or application task, which makes it more general and versatile.

3) The dissertation integrates the results of short-text conceptualization and short-text vectorization into research on query expansion for microblog retrieval. It proposes a conceptual feedback model for query expansion that effectively filters noise in the pseudo-relevance feedback documents and improves the quality of the expansion terms. This alleviates the vocabulary mismatch problem and the problem of insufficient input signal in the microblog retrieval task.
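
To make the query expansion idea in item 3 more concrete, here is a minimal sketch of pseudo-relevance feedback: the top-ranked documents for a query are assumed relevant, and their most frequent new terms are appended to the query. The optional `allowed_terms` filter is a hypothetical stand-in for a concept-based noise filter; it is not the feedback concept model from the dissertation, and the sample microblog texts are invented.

```python
from collections import Counter

def expand_query(query_terms, feedback_docs, allowed_terms=None, k=2):
    """Append the k most frequent new terms from the feedback documents.

    feedback_docs: texts of the top-ranked (pseudo-relevant) documents.
    allowed_terms: optional set of terms permitted as expansions; a
        hypothetical placeholder for a concept-based noise filter.
    """
    counts = Counter()
    for doc in feedback_docs:
        for term in doc.lower().split():
            if term in query_terms:
                continue  # skip terms already in the query
            if allowed_terms is not None and term not in allowed_terms:
                continue  # skip terms rejected by the filter
            counts[term] += 1
    # Sort by descending count, then alphabetically for determinism.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return list(query_terms) + [term for term, _ in ranked[:k]]

# Invented microblog-style feedback documents.
feedback = [
    "apple iphone launch event",
    "apple iphone price lol",
    "iphone launch lol lol",
]

unfiltered = expand_query(["apple"], feedback)
filtered = expand_query(
    ["apple"], feedback,
    allowed_terms={"iphone", "launch", "event", "price"},
)
print(unfiltered)  # ['apple', 'iphone', 'lol']
print(filtered)    # ['apple', 'iphone', 'launch']
```

Without the filter, the noisy term "lol" is picked up as an expansion term because it is frequent in the feedback documents; with the filter, it is replaced by a topical term.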

 

2. Representative Results

Academic papers:

1) Huang H, Wang Y, Feng C, et al. Leveraging Conceptualization for Short-Text Embedding[J]. IEEE Transactions on Knowledge and Data Engineering, 2018, 30(7): 1282-1295. (CCF, SCI Zone 2, IF = 2.775)

 

2) Wang Y, Huang H, Feng C, et al. Community Detection Based on Minimum-Cut Graph Partitioning[C]// 16th International Conference on Web-Age Information Management (WAIM 2015). Springer International Publishing, 2015: 57-69. (CCF-recommended Class C conference)

 

3) Wang Y, Huang H, Feng C, et al. CSE: Conceptual Sentence Embeddings based on Attention Model[C]// 54th Annual Meeting of the Association for Computational Linguistics (ACL 2016). 2016: 505-515. (CCF-recommended Class A conference, EI-indexed)

 

4) Wang Y, Huang H, Feng C. Query Expansion Based on a Feedback Concept Model for Microblog Retrieval[C]// 26th International Conference on World Wide Web (WWW 2017). International World Wide Web Conferences Steering Committee, 2017: 559-568. (CCF-recommended Class A conference)

 

5) Wang Y, Huang H, Feng C, et al. Conceptual Sentence Embeddings[C]// 17th International Conference on Web-Age Information Management (WAIM 2016). Springer International Publishing, 2016: 390-401. (CCF-recommended Class C conference)

 

Awards

2018.11. Received the 2018 Outstanding Doctoral Dissertation Award of the Chinese Association for Artificial Intelligence.

 

2018.6. Received the 2018 Outstanding Doctoral Dissertation Award of Beijing Institute of Technology.