The preliminary list of projects is quite large. Before you pick your favorite project, you may want to read this post to get familiar with the categories. Of course, we are also open to discussing your own ideas; just talk to us.
The first category is Information Retrieval (IR), which contains a lot of small problems, all of them focused on full-text search. For example, learning to rank, interleaving, and modeling all contribute to improving the search engine results page (SERP). The rest of the tasks in this category address problems related to query expansion.
The Semantic Search section formulates several problems related to future search engines, which will be able to answer complicated questions such as “How old is Tom Cruise?”. To prepare the right answer, we need to be able to find the entities in a text or in a query.
We also offer topics in the Ad Selection category. The problem is simple to formulate but not easy to implement: we need to place on the SERP, or on a customer's web page, the ad that is most attractive to users. We want to maximize the number of user clicks on an ad and thereby maximize income.
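To make the formulation a bit more concrete, here is a minimal sketch in Python of the simplest possible approach: pick the ad with the highest estimated click-through rate (CTR). The names `AdStats`, `estimated_ctr`, and `select_ad`, as well as the smoothing constants, are our own assumptions for illustration only. A real system would also have to consider the user, the query, and the price of a click.

```python
from dataclasses import dataclass


@dataclass
class AdStats:
    """Hypothetical click log summary for one ad."""
    ad_id: str
    impressions: int
    clicks: int


def estimated_ctr(ad, prior_clicks=1.0, prior_impressions=100.0):
    """Smoothed click-through rate.

    The prior values are arbitrary assumptions. They keep an ad with
    very few impressions from winning on one or two lucky clicks.
    """
    return (ad.clicks + prior_clicks) / (ad.impressions + prior_impressions)


def select_ad(candidates):
    """Return the candidate ad with the highest smoothed CTR."""
    if not candidates:
        raise ValueError("no candidate ads to choose from")
    return max(candidates, key=estimated_ctr)


ads = [
    AdStats("a", impressions=1000, clicks=30),
    AdStats("b", impressions=1000, clicks=50),
    AdStats("c", impressions=2, clicks=1),  # too little data to trust
]
best = select_ad(ads)
print(best.ad_id)
```

Without the smoothing, ad "c" would look best with a raw CTR of 50% after only two impressions. This is one small example of why the problem is harder to implement than to formulate.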
Spam Detection is a classical problem in machine learning. The fight still continues, in new and more twisted incarnations. The problems in this category focus on these new forms of spam.
We also have a large range of problems in the Recommendation System category. The best example? Everybody in the field knows the Netflix Prize competition with its astonishing first prize of 1 million USD. A large number of teams tried to recommend the best movies to viewers. Recommendation is very important for every Internet shop and can help increase revenue dramatically.
Many applications on today's Internet are mashups, and this is the next category. They are put together from simple web services with REST APIs. Imagine, for example, an accounting system that compares your performance with that of similar companies. Another segment in this category is vertical search. These services can be implemented, for example, with open-source packages such as Lucene and Solr.
If you find a task description difficult to understand, or cannot imagine what a particular problem is about and why it is listed, do not be frightened; we will help you get through it. We will soon announce a meeting to answer your questions and help you with the selection. This will also be the ideal place to discuss your own ideas. Follow us on Twitter (#CTUESC) or Facebook. If you want to sign up, here is the entry form.