Tuesday, July 30, 2013

We have recently finished several interesting machine learning projects. I have updated our Cloud Computing Center, where you can find more details, other finished projects, and a description of what we are working on now.

First, let me describe a new implementation of Contact-less heart beat measurement on iPad. Jan Plešek has designed an iPad app that works similarly to "Mirror, mirror, tell me who's the most beautiful?" You just watch the iPad screen, positioning your face inside a square, and in a few moments the system gives you a pulse rate estimate. The application uses a unique algorithm for pulse frequency estimation based on observed changes in the color of the skin under the eyes. The human heart pumps blood, and the built-in iPad camera can recognize the difference in skin color between the high-pressure and low-pressure phases. To focus on the right place, the app must first find the face, nose, and eyes, and then focus on the area under the eyes. The color changes form a time series, to which we apply a simple algorithm that estimates its base frequency. We have achieved accuracy similar to that of standard heart rate measurement equipment.
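To give a feel for that last step, here is a minimal sketch of estimating the base frequency of a color time series. It is not Jan's algorithm, only one plausible approach: scan candidate heart rates and pick the one with the highest spectral power. The function names, the 40 to 180 bpm search range, and the synthetic test signal are all assumptions made for illustration.

```python
import math


def estimate_pulse_bpm(samples, fps, min_bpm=40.0, max_bpm=180.0, step_bpm=0.5):
    """Return the candidate heart rate (beats per minute) with the highest spectral power.

    samples: average skin color values, one per video frame (assumed input).
    fps: camera frame rate in frames per second.
    """
    n = len(samples)
    mean = sum(samples) / n
    centered = [s - mean for s in samples]  # remove the constant brightness level

    best_bpm, best_power = min_bpm, -1.0
    steps = int(round((max_bpm - min_bpm) / step_bpm))
    for k in range(steps + 1):
        bpm = min_bpm + k * step_bpm
        omega = 2.0 * math.pi * (bpm / 60.0) / fps  # radians per frame
        # Correlate the signal with a cosine and a sine at this frequency.
        re = sum(x * math.cos(omega * i) for i, x in enumerate(centered))
        im = sum(x * math.sin(omega * i) for i, x in enumerate(centered))
        power = re * re + im * im
        if power > best_power:
            best_bpm, best_power = bpm, power
    return best_bpm


def synthetic_skin_signal(bpm, fps, seconds):
    """Hypothetical test signal: a weak pulse riding on a slow brightness drift."""
    n = int(fps * seconds)
    freq_hz = bpm / 60.0
    return [
        100.0 + 0.5 * math.sin(2.0 * math.pi * freq_hz * i / fps) + 0.01 * i / fps
        for i in range(n)
    ]


if __name__ == "__main__":
    signal = synthetic_skin_signal(bpm=72.0, fps=30.0, seconds=20.0)
    print("estimated pulse (bpm):", estimate_pulse_bpm(signal, fps=30.0))
```

Restricting the search to a plausible heart rate band keeps slow lighting changes and fast camera noise from being mistaken for the pulse. A real app would also have to cope with head movement, which this sketch ignores.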

The next project was done in cooperation with AVG. Yes, the famous antivirus company. I bet their Free Antivirus software is softly humming on many of your machines while you read this. Ondrej Pluskal has specialised in the development of malware detection, particularly in the algorithm for estimating anomalies. The antivirus continuously watches what is going on on your PC: what files are downloaded, what apps are started, what DLLs are loaded, etc. All these and similar actions are signals that form feature vectors. The task is then simple to state: identify the vectors that signal an anomaly. Designing an algorithm that distinguishes perfectly legal vectors or situations from malicious ones requires learning. Teaching a classifier requires a lot of vectors describing usual situations, as well as malicious vectors, so that it can discover the difference between them. We received such data, collected on running PCs, from AVG. Our classifier uses a Support Vector Machine (SVM). After a lot of work on preprocessing, testing, and tuning of the operating point, we delivered a new classification vector that improves the performance. We hope our solution will soon make it into the product.
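For readers who have not met an SVM before, the sketch below trains a tiny linear SVM on feature vectors and then applies a decision threshold, which plays the role of the operating point. Everything here is an assumption for illustration: the three features, the toy numbers, and the plain subgradient training loop are invented and have nothing to do with the AVG data or with Ondrej's actual classifier.

```python
import random

# Hypothetical feature vectors, scaled to [0, 1]:
# [download rate, process start rate, share of unknown DLLs]
TRAIN_VECTORS = [
    [0.10, 0.20, 0.10], [0.20, 0.10, 0.20], [0.15, 0.25, 0.10], [0.05, 0.10, 0.15],  # usual
    [0.90, 0.80, 0.70], [0.80, 0.90, 0.90], [0.70, 0.95, 0.80], [0.85, 0.70, 0.90],  # malicious
]
TRAIN_LABELS = [-1, -1, -1, -1, 1, 1, 1, 1]  # -1 = usual, +1 = malicious


def train_linear_svm(vectors, labels, lam=0.01, learning_rate=0.05, epochs=1000, seed=0):
    """Minimise the L2-regularised hinge loss with stochastic subgradient descent."""
    rng = random.Random(seed)
    w = [0.0] * len(vectors[0])
    b = 0.0
    order = list(range(len(vectors)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            x, y = vectors[i], labels[i]
            margin = y * (sum(wj * xj for wj, xj in zip(w, x)) + b)
            # The regulariser shrinks the weights on every step.
            w = [(1.0 - learning_rate * lam) * wj for wj in w]
            # The hinge loss contributes only when the margin is violated.
            if margin < 1.0:
                w = [wj + learning_rate * y * xj for wj, xj in zip(w, x)]
                b += learning_rate * y
    return w, b


def anomaly_score(w, b, x):
    """Signed distance-like score; larger means more suspicious."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b


def is_anomalous(w, b, x, threshold=0.0):
    """Raising the threshold gives fewer false alarms but more missed detections."""
    return anomaly_score(w, b, x) > threshold


if __name__ == "__main__":
    w, b = train_linear_svm(TRAIN_VECTORS, TRAIN_LABELS)
    for sample in ([0.1, 0.1, 0.1], [0.9, 0.9, 0.9]):
        print(sample, "anomalous:", is_anomalous(w, b, sample))
```

In practice one would use a tested SVM library and far richer features. The threshold parameter is the part worth noticing, because choosing it is how the trade-off between false alarms and missed malware gets tuned.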

Third, Tonda Novák was involved in the Design of Probabilistic Models for Text Input Correction project. It was a great opportunity to explore the capabilities of learning to rank algorithms, which became the core of the solution. What does it do? When entering a query into a search engine, users make mistakes, and the engine needs to correct them before starting the search. Finding out what the user really wants to enter is not a simple task. Users enter many more different words than we can find in a standard dictionary, new words appear or are created every day, some of the words appear in multi-word phrases, and some users do not know the spelling but know the phonetic version. We have collected all this information to guess what the user wants to type. Of course, there are multiple choices for each word. The problem is to put all the information together and run an algorithm that decides which word or phrase the user is most likely about to enter. Tonda decided to use a pairwise learning to rank algorithm. It outputs a ranked list of corrected queries. It is a supervised algorithm, so it requires a training set. We used a corpus of queries from the seznam.cz search engine. Jointly with seznam.cz, we set up a testing server to run tests on an unseen test set and measure the accuracy. Our machine works quite well. The proof: parts of our algorithm are already in the production version of seznam.cz query correction. Hooray!
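The pairwise idea can be shown in a few lines. The sketch below learns a linear scoring function from pairs of candidates, where one candidate in each pair is known to be the better correction, and then sorts new candidates by score. It is only an illustration under stated assumptions: the three features, the toy training groups, the example queries, and the logistic pairwise loss are made up here and are not Tonda's model or the seznam.cz data.

```python
import math

# Each group holds the candidates for one misspelled query as (features, relevance).
# Hypothetical features: [similarity to the typed text, query popularity, phonetic match]
TRAINING_GROUPS = [
    [([0.90, 0.80, 1.0], 1), ([0.90, 0.10, 0.0], 0), ([0.50, 0.90, 0.0], 0)],
    [([0.70, 0.60, 1.0], 1), ([0.80, 0.05, 0.0], 0), ([0.30, 0.70, 1.0], 0)],
    [([0.85, 0.50, 0.0], 1), ([0.60, 0.60, 0.0], 0), ([0.90, 0.02, 0.0], 0)],
]


def score(w, features):
    """Linear score of one candidate correction."""
    return sum(wj * fj for wj, fj in zip(w, features))


def train_pairwise_ranker(groups, epochs=200, learning_rate=0.1):
    """Fit weights so that better candidates score higher than worse ones.

    For every pair (better, worse) within a group, take a gradient step on the
    logistic loss log(1 + exp(-(score(better) - score(worse)))).
    """
    w = [0.0] * len(groups[0][0][0])
    for _ in range(epochs):
        for candidates in groups:
            for feat_a, rel_a in candidates:
                for feat_b, rel_b in candidates:
                    if rel_a <= rel_b:
                        continue  # only pairs where a is strictly better than b
                    diff = [a - b for a, b in zip(feat_a, feat_b)]
                    s = score(w, diff)
                    step = learning_rate / (1.0 + math.exp(s))  # lr * sigmoid(-s)
                    w = [wj + step * dj for wj, dj in zip(w, diff)]
    return w


def rank_candidates(w, candidates):
    """Return candidate texts sorted from most to least likely correction."""
    ordered = sorted(candidates, key=lambda c: score(w, c[1]), reverse=True)
    return [text for text, _ in ordered]


if __name__ == "__main__":
    w = train_pairwise_ranker(TRAINING_GROUPS)
    # Hypothetical candidates for a mistyped query such as "pragu wether".
    candidates = [
        ("pragu whether", [0.90, 0.10, 0.0]),
        ("prague weather", [0.85, 0.70, 1.0]),
        ("prague leather", [0.40, 0.80, 0.0]),
    ]
    print(rank_candidates(w, candidates))
```

The appeal of the pairwise formulation is that the training data only has to say which of two corrections is better, not how good each one is on an absolute scale.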

I have chosen these three examples of our projects to show how we want to progress in the future. It is simple: we want to focus on machine learning algorithms in practical applications. Most students want to work on real industry problems. They would like to get practical experience before leaving school and to cooperate with a company to see how it feels to work for it. This is also the best form of cooperation for the faculty, because we can apply the latest research results and confront them with industry needs. It is not easy to find the right projects and put all the required ingredients together. It requires people with vision and empathy on both sides, in industry and in academia. Not all companies are ready, and not all companies compete by delivering better and technologically more advanced solutions. At the university, searching for the best partners, with leaders who are interested in innovation and in bringing their customers the best solutions, is a never-ending process. We are looking for more partners with a clear technological vision, to help them solve the most challenging problems in AI, machine learning, and computer science. If you know a suitable company, let me know.