Thursday, February 28, 2013

Machine learning for Internet

I have a passion for eClub and startups, but I also do research with my students. I would like to share with you some thoughts about our strategy.

I have a team of doctoral, MSc and BSc students working mainly on Internet applications. Most of our projects are in machine learning, some are in cloud infrastructure and provisioning, and the rest are mobile applications. We are lucky, because all of these are of high interest to many industries today.

What are the essential ingredients for our research? For machine learning we need data; in fact, we need big data. And big data needs big infrastructure.

What is the industry situation? The leading Internet companies have pioneered big data processing and are currently capitalizing on this success. Many have built large data centers and offer them as a service, but they keep the smart algorithms in house. Many small and medium-sized companies generate large amounts of data but have no smart algorithms. They lack the research resources, investment and know-how to develop them.

Our target is to develop useful applications and algorithms. For that we need to understand what our customers' needs are, and they vary widely. Sometimes the law requires the data to be physically located in the home country or in the EU. Some customers want to keep all the data on premises. Sometimes the customer needs to use both public and private resources. Every company has its data in a different format, and each company wants to extract different information from it.

What should we do at universities? We have to find partners in industry who can provide the big data and a good problem formulation.

Who are the best partners for a university? They are the companies that generate big data, that already feel the need for smart software and that want to differentiate themselves on the market. We must look for companies with established internal processes, which understand that the university can deliver the algorithm but cannot be responsible for integration or product delivery.

We can research custom solutions for our partners. Open source is another important distribution channel for a university. Open source smart algorithms must stick to standards. To make the algorithms useful for small and medium-sized companies, we need to offer the solutions along with platforms for building, provisioning, managing and monitoring. The tools must run on popular APIs, for example AWS, which is becoming a standard for the cloud.

Companies' needs for data location and processing will differ. Some will require keeping the data in particular locations, on private or hybrid clouds. These requirements are best satisfied by deploying to open source cloud operating systems (Eucalyptus, OpenStack, …), which can be easily installed on private clouds. But the provisioning and management systems for them are nonexistent or in their infancy. This opens another track for research: developing smart algorithms together with provisioning, management and monitoring tools that are easy to deploy to standard clouds. In addition, these offerings must install on private, public and hybrid clouds too.

Here is the conclusion: look for partners who can share their big data with us and help them design smart applications. For other customers, provide smart applications with provisioning, management and monitoring on open source clouds.

BTW I am looking for new doctoral students ...

2 comments:

  1. What kind of machine learning problems are you working on?

    We have lots of data and are open to collaboration with your students.

    Michal Illich, http://magictable.com/

  2. Cloud computing is the advancement in the earlier technology which enables the utilization of resources and data from invisible and virtual protocols. In other words, the predecessor of this technology helped in making these businesses virtual and online.

