Ideas
Created: 2022-03-28 10:42
#note
My RecSys could work as follows:
- Documents and pictures are embedded using Data2Vec - A General Framework for Self-supervised Learning in Speech, Vision and Language;
- The user embedding is obtained by applying data2vec to the history of liked and/or created posts;
- Topic modeling (Top2Vec or BERTopic) extracts the most important features of the documents, i.e. the topics. The same is done for the user embedding;
- The posts to recommend are retrieved with a nearest-neighbor search library (FAISS or NMSLIB);
- The pictures of the retrieved posts are ranked.
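A minimal sketch of the retrieval part of the steps above, to make the idea concrete. Everything here is an assumption for illustration: random vectors stand in for data2vec embeddings, the user embedding is a plain mean over the history (the note does not say how the history is aggregated), and a brute-force cosine search in NumPy stands in for FAISS or NMSLIB. The function names are hypothetical.

```python
from typing import List, Sequence, Tuple

import numpy as np


def l2_normalize(x: np.ndarray) -> np.ndarray:
    """Scale vectors to unit length so a dot product equals cosine similarity."""
    norms = np.linalg.norm(x, axis=-1, keepdims=True)
    return x / np.clip(norms, 1e-12, None)


def user_embedding(post_embeddings: np.ndarray, history: Sequence[int]) -> np.ndarray:
    """Assumed aggregation: mean embedding of the liked and/or created posts."""
    return post_embeddings[list(history)].mean(axis=0)


def recommend(
    post_embeddings: np.ndarray, history: Sequence[int], k: int = 5
) -> Tuple[List[int], List[float]]:
    """Return the ids and cosine scores of the k posts closest to the user."""
    posts = l2_normalize(post_embeddings)
    user = l2_normalize(user_embedding(posts, history))
    scores = posts @ user               # brute-force stand-in for an ANN index
    scores[list(history)] = -np.inf     # never recommend posts from the history
    top = np.argsort(-scores)[:k]
    return top.tolist(), scores[top].tolist()


# Placeholder data: 100 posts with 32-dimensional random "embeddings".
rng = np.random.default_rng(0)
post_embeddings = rng.normal(size=(100, 32))
history = [3, 17, 42]
ids, scores = recommend(post_embeddings, history, k=5)
print(ids)
```

With a real index, only the `posts @ user` line would change; the final ranking of pictures would then run on the returned ids.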
I need to compare the results of two scenarios:
- using topic modeling to reduce the number of features;
- without using topic modeling.
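One possible way to compare the two scenarios, sketched under assumptions: the note does not name a metric, so recall@k against held-out liked posts and the overlap between the two top-k lists are my own choice, and the post ids below are made up purely for illustration.

```python
from typing import Sequence


def recall_at_k(recommended: Sequence[int], relevant: Sequence[int], k: int) -> float:
    """Fraction of the relevant (held-out liked) posts found in the top k."""
    if not relevant:
        return 0.0
    hits = len(set(recommended[:k]) & set(relevant))
    return hits / len(set(relevant))


def overlap_at_k(a: Sequence[int], b: Sequence[int], k: int) -> float:
    """Jaccard overlap of two top-k lists: 1.0 means identical sets."""
    top_a, top_b = set(a[:k]), set(b[:k])
    union = top_a | top_b
    return len(top_a & top_b) / len(union) if union else 1.0


# Made-up recommendation lists for one user (hypothetical post ids).
with_topics = [4, 8, 15, 16, 23]      # scenario 1: topic-reduced features
without_topics = [8, 42, 4, 99, 15]   # scenario 2: raw embeddings
held_out_likes = [8, 15, 42]          # posts the user liked later

print("recall@5 with topics:   ", recall_at_k(with_topics, held_out_likes, 5))
print("recall@5 without topics:", recall_at_k(without_topics, held_out_likes, 5))
print("overlap@5:              ", overlap_at_k(with_topics, without_topics, 5))
```

Averaging these numbers over many users would show whether the topic-modeling step costs recommendation quality, and how different the two result sets actually are.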
References
Code
Tags
#to-do #ideas