Wednesday, July 20, 2016

eClub Summer Camp IoT, Machine Learning

The eClub Summer Camp is in full swing and our lab is full of students. The projects can be divided into two groups: IoT and Machine Learning.

The IoT group is busy designing the architecture for connecting HUBs with cloud servers. We assume the cloud will have to serve millions of HUBs collecting information from sensors and controlling actuators. We are discussing and estimating how many events we will collect from sensors, how active the smartphone users will be, and how much administrative traffic (heartbeats, updates, etc.) we will have to support. The user profiles, sensors, and HUB configurations need to be maintained in databases. We also plan to save all logs to provide access to historical data. There will probably be two different systems, one for handling the incoming data and another for storing the logs. Hadoop with HDFS seems to be the choice for managing the logs, and Spark for filtering and managing the events. Of course, security is one of the most important features of the system, and we are busy studying communication protocols. It is a large project, and many students are working on the specification and on testing parts of the design. Our goal is to create a proof of concept showing the HUB-to-cloud communication before the end of this year.
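To give a feeling for the kind of estimate we are discussing, here is a minimal back-of-the-envelope sketch in Python. Every number in it (sensors per HUB, sampling rate, heartbeat interval, message size) is an illustrative assumption, not a measurement or a decision from the project.

```python
from dataclasses import dataclass

SECONDS_PER_DAY = 24 * 60 * 60


@dataclass
class TrafficAssumptions:
    """Illustrative placeholder values only; none of them come from the project."""
    hubs: int = 1_000_000
    sensors_per_hub: int = 10
    events_per_sensor_per_day: int = 144  # one reading every 10 minutes
    heartbeat_interval_s: int = 60        # one heartbeat per HUB per minute
    avg_message_bytes: int = 200


def estimate(a: TrafficAssumptions) -> dict:
    """Return average message rates and the daily raw log volume."""
    sensor_events_per_day = a.hubs * a.sensors_per_hub * a.events_per_sensor_per_day
    heartbeats_per_day = a.hubs * (SECONDS_PER_DAY // a.heartbeat_interval_s)
    total_per_day = sensor_events_per_day + heartbeats_per_day
    return {
        "sensor_events_per_s": sensor_events_per_day / SECONDS_PER_DAY,
        "heartbeats_per_s": heartbeats_per_day / SECONDS_PER_DAY,
        "messages_per_s": total_per_day / SECONDS_PER_DAY,
        "log_gb_per_day": total_per_day * a.avg_message_bytes / 1e9,
    }


result = estimate(TrafficAssumptions())
for name, value in result.items():
    print(f"{name}: {value:,.1f}")
```

With these placeholder values the heartbeats alone generate as much traffic as the sensor events, which is why the administrative traffic has to be part of the estimate.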

We also have a large group focusing on conversational systems. The work is centered around the open source YodaQA factoid question answering engine, which was inspired by the Watson Jeopardy system. It already answers questions in English. Our major task is to port it to Czech and improve its functionality. We are working on the integration of the Wikidata knowledge base, and we have to retrain many of the NLU blocks for Czech. One of the students is working on a Czech parser model for the Google SyntaxNet parser.

We are also looking at bots, which are good for creating simple conversational apps, for example for controlling a simple home IoT setup. The bot technology is based on an information retrieval (IR) approach: we search for the best answer to a particular question. In this field, we have been working on Sentence Pair Similarity algorithms, which can be trained to recognize the intent of a question. Students are also looking at new development packages such as wit.ai, api.ai, Microsoft LUIS, and others. We try to develop examples of small, simple applications. We believe the hands-on experience will help us understand where the limitations of the small, IR-based systems are and where we need to opt for the YodaQA technology. The least interesting but essential part of our effort is creating training databases. Everybody is involved in the hard work of data set collection.
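The retrieval idea behind such a bot can be sketched in a few lines of Python. This is only a toy under stated assumptions: the example commands and intent names are made up, and plain word-overlap cosine similarity stands in for a trained Sentence Pair Similarity model.

```python
import math
import re
from collections import Counter


def bag_of_words(text: str) -> Counter:
    """Lowercase the text and count its alphanumeric tokens."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity of two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


# Hypothetical training examples: (example question, intent label).
EXAMPLES = [
    ("turn on the light in the kitchen", "light_on"),
    ("switch off the light", "light_off"),
    ("what is the temperature in the living room", "read_temperature"),
]


def best_intent(question: str, examples=EXAMPLES):
    """Return (similarity, intent) of the most similar stored example."""
    q = bag_of_words(question)
    return max((cosine(q, bag_of_words(text)), intent) for text, intent in examples)


print(best_intent("please turn the kitchen light on"))
print(best_intent("how warm is the living room"))
```

A trained similarity model would replace `cosine` here. The rest of the loop, scoring the question against stored examples and picking the best one, stays the same.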

If you are interested, join us or visit us; we can help you select an interesting project. There is still time to join.

Tuesday, May 31, 2016

X.GLU startup in eClub

Last week I visited the Pioneers Festival in Vienna. This was also the first public presentation of the new eClub startup X.GLU.

The X.GLU startup has developed a revolutionary glucometer, also called X.GLU. It is the smallest glucose meter: it is the size of a credit card and simply slips into your wallet. X.GLU requires no batteries and no wires to read the sugar level on your smartphone. As long as your smartphone is charged, the glucose meter works. No maintenance is required. X.GLU uses a standard connector for biomedical sensor paper. It comes in a convenient bag along with disinfection tissues, lancets, and testing strips. The readout is transmitted over NFC, which provides a secure wireless link between X.GLU and the smartphone. Unlike Bluetooth, Wi-Fi, and similar wireless technologies, NFC cannot be sniffed from a distance of more than several inches. The measured values are displayed and stored in the smartphone. An encrypted connection sends the X.GLU data to the cloud and makes it available to physicians, who can provide instant feedback on the treatment.

The smartphone app comes with a how-to video. It shows detailed instructions on how to treat the skin before taking the sample and how to take the blood sample properly. The app also conveniently reminds the user of the scheduled measurement time.

The mastermind of the new company is its inventor and owner, Marek Novak, who came up with the idea of the glucometer. Marek is one of the most active students in eClub. He has already worked on several IoT-related projects, but X.GLU is the first one we want to get to production. eClub helped to complement Marek’s knowledge and found experts in sales and marketing to create a functional company. They are starting their operation from our scientific incubator.

It is great news for eClub. We will all do our best to help X.GLU onto a productive and successful path to market. We are looking for other student teams with startup ideas. Join us during the eClub Summer Camp.

Tuesday, May 10, 2016

Our projects part 2

This is the second part of “What we do”, this time about our IoT activities.

Our IoT effort can be roughly divided into two parts: SW infrastructure and sensors. We use the standard IoT architecture combining a HUB and a cloud server. It is a typical IoT system setup that allows us to collect sensor information and control actuators over the Internet. The HUB serves as a gateway to the Internet and as a concentrator for the sensor data. It is a simple computer with about the same power as a router, equipped with Ethernet, WiFi, or both to connect to the Internet. In addition, it may have several other radios for communication with sensors and actuators. The radios are continuously listening to sensors, which requires power; therefore, HUBs are usually not battery powered.

Our HUB is based on the Intel Edison, a dual-core 500 MHz Linux-based embedded computer with WiFi, BLE, and an 868 MHz free-band radio. It also includes three USB sockets for additional peripherals. The HUB runs Zetta, a simple Node.js server. This server handles the management and the communication with servers. It allows a seamless connection to a similar Zetta server residing in the cloud. The linked Zetta servers communicate using Siren, a walkable, JSON-based hypermedia format. The cloud-based servers allow a simple connection to the smartphone. The Siren hypermedia allows the smartphone to set up the UI based on the configuration of the particular space covered by a HUB. We have designed and implemented an IoT control app for Android that configures itself based on location. In practice, this means that as soon as you get to a smart room or to your car, the Android home page adapts to the particular environment, with the most frequently used controls on top.
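To illustrate how a client can build its UI from hypermedia, here is a small Python sketch. The entity below is a hand-written, hypothetical example that follows the general Siren structure (class, properties, actions, links); it is not actual output of our HUB or of Zetta, and the URL and device name are made up.

```python
import json

# Hypothetical Siren entity describing one device behind a HUB.
HUB_ENTITY_JSON = """
{
  "class": ["device", "lamp"],
  "properties": {"name": "Desk lamp", "state": "off"},
  "actions": [
    {"name": "turn-on", "method": "POST",
     "href": "http://hub.example/devices/lamp-1",
     "fields": [{"name": "action", "type": "hidden", "value": "turn-on"}]}
  ],
  "links": [{"rel": ["self"], "href": "http://hub.example/devices/lamp-1"}]
}
"""


def ui_controls(entity: dict) -> list:
    """Derive one UI control description per action offered by the entity."""
    name = entity.get("properties", {}).get("name", "unnamed")
    return [
        {
            "label": f"{name}: {action['name']}",
            "method": action.get("method", "GET"),  # Siren defaults to GET
            "href": action["href"],
        }
        for action in entity.get("actions", [])
    ]


def self_link(entity: dict):
    """Return the entity's own URL, or None if it has no 'self' link."""
    for link in entity.get("links", []):
        if "self" in link.get("rel", []):
            return link["href"]
    return None


entity = json.loads(HUB_ENTITY_JSON)
controls = ui_controls(entity)
print(controls)
print(self_link(entity))
```

The point of the design is that the client does not hard-code which buttons exist: it shows whatever actions the server currently advertises for the space it is in.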

We do not use WiFi for communicating with sensors. WiFi usually requires a lot of battery power, and it is primarily designed for the TCP/IP protocol, which may not be required for very simple sensors such as thermometers. A thermometer samples the ambient temperature, for example, only every 10 minutes, and therefore we can let the sensor sleep most of the time. The radio wakes up only for the shortest possible communication required to exchange information with the HUB. This approach allows us to design sensors with very low energy requirements.
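The effect of sleeping between samples is easy to see with a little arithmetic. The sketch below computes the average current of a duty-cycled sensor; the sleep current, active current, and active time are assumed round numbers for illustration, not measurements of our sensors. Only the 10-minute period comes from the text above.

```python
def average_current_ua(sleep_ua: float, active_ma: float,
                       active_ms: float, period_s: float) -> float:
    """Average current in microamperes over one wake/sleep period."""
    active_s = active_ms / 1000.0
    sleep_s = period_s - active_s
    # Charge per period in microcoulombs (microamperes times seconds).
    charge_uc = active_ma * 1000.0 * active_s + sleep_ua * sleep_s
    return charge_uc / period_s


# Assumed values: 2 uA asleep, 20 mA for 50 ms while measuring and transmitting.
avg = average_current_ua(sleep_ua=2.0, active_ma=20.0, active_ms=50.0, period_s=600.0)
duty_cycle = (50.0 / 1000.0) / 600.0

print(f"average current: {avg:.2f} uA")
print(f"duty cycle: {duty_cycle:.4%}")
```

Under these assumptions the sensor averages only a few microamperes even though the radio draws tens of milliamperes while awake, because it is awake for less than a hundredth of a percent of the time.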

The low consumption of the sensors allowed us to use one of the energy harvesting approaches, photovoltaic (PV) cells. We have designed and put together a set of PV-powered, battery-less, wireless sensors. We can measure temperature, humidity, and motion (accelerometers and PIR). The PIR-equipped sensor is powered just by a fluorescent tube on the ceiling, and it has been sensing people coming to our lab for more than a year. We monitor the energy accumulated from the PV cell, and so far we have never run out of power. We use the accelerometer-equipped sensors to check for open windows. The outside light is also good enough to provide enough juice. The sensors communicate with the HUB using 868 MHz radios. We have found this band to be less affected by obstacles than WiFi or Bluetooth. Currently, we use a proprietary protocol, but we are looking at LoRa and MQTT, which we use in other projects with the same HUB.


Some of the described work is part of the Bachelor’s theses written by my students. We are looking forward to pushing our work even further during eClub Summer Camp 2016.

Tuesday, May 3, 2016

Our projects

Recently I was asked to review the latest developments in our group, and I realised how much work we have done. I have also noticed that I have been neglecting my blog. Let’s fix it!

First, the best news: our group has grown during the last year to five PhD and around 10 MSc students working in machine learning.

Today I would like to start with part one and mention some of our progress in machine learning. In the second part I will describe our IoT effort. The main machine learning topics can be broken into the following categories:
  • Natural Language Processing
    • Question answering YodaQA
    • Intelligent assistants
    • Sentence pair similarity
    • Multinomial classification
  • Information retrieval
    • Learning to rank 
  • Information extraction
    • Focused crawling
    • Convolutional Neural Networks
      • Combining text and images
      • Image labeling 
I’ll start with the major achievement, the YodaQA answering machine. It is an open source question answering system. It implements state-of-the-art methods of information extraction and natural language understanding to answer human-phrased questions! You can try the live demo.

Along with YodaQA, we have also worked on simpler Intelligent Assistants acting on a smaller number of commands. They take advantage of simpler algorithms that find the most similar answer for a given query. Sentence pair similarity is another topic of interest. The algorithms can help solve not only Answer Sentence Selection but also other interesting problems, such as Next Utterance Ranking, Semantic Textual Similarity, Paraphrase Identification, Recognizing Textual Entailment, etc. We have developed and tested a series of algorithms based on word embeddings and different neural network architectures.

The multinomial classification algorithm also belongs to the NLP category. The use case we are testing is the categorization of products into a hierarchical directory structure. Typically, e-shops categorize products, such as a “14 inch screen notebook”, under notebooks, computers, electronics, etc. This process is handled by human beings, and they make mistakes. Our algorithm can find problematically categorized entries or suggest the correct category.

Adaptive ranking research has a special position in our group. The web content, its relevancy and authority, and the users’ interests are changing constantly. The goal of any search engine is to provide exactly what the users are looking for. The newly developed algorithm relies on the users to find the currently best ranking. It constantly observes which links users click on, and it adapts based on this feedback. This is very relevant for information search and recommendation services.
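The general idea of learning a ranking from clicks can be sketched as follows. This is not the algorithm developed in our group; it is a deliberately simple stand-in (smoothed click-through rate with occasional random exploration), and the class and parameter names are made up for the illustration.

```python
import random
from collections import defaultdict


class ClickRanker:
    """Toy ranker that orders links by smoothed click-through rate."""

    def __init__(self, epsilon: float = 0.1, seed: int = 0):
        self.epsilon = epsilon          # probability of an exploration step
        self.rng = random.Random(seed)
        self.shows = defaultdict(int)   # how often each link was displayed
        self.clicks = defaultdict(int)  # how often each link was clicked

    def score(self, link: str) -> float:
        # The +1/+2 smoothing keeps never-shown links at a neutral 0.5.
        return (self.clicks[link] + 1) / (self.shows[link] + 2)

    def rank(self, links: list) -> list:
        ranked = sorted(links, key=self.score, reverse=True)
        if len(ranked) > 1 and self.rng.random() < self.epsilon:
            # Exploration: promote a random lower-ranked link to the top
            # so that it gets a chance to collect clicks.
            i = self.rng.randrange(1, len(ranked))
            ranked.insert(0, ranked.pop(i))
        return ranked

    def feedback(self, shown: list, clicked=None) -> None:
        for link in shown:
            self.shows[link] += 1
        if clicked is not None:
            self.clicks[clicked] += 1


ranker = ClickRanker(epsilon=0.0)  # exploration switched off for a repeatable demo
for _ in range(20):
    shown = ranker.rank(["a", "b", "c"])
    ranker.feedback(shown, clicked="c")  # simulated users always click "c"
print(ranker.rank(["a", "b", "c"]))
```

After a few simulated clicks the link the users prefer moves to the top, without anybody labeling the data by hand. A real system also has to cope with position bias, which this sketch ignores.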

Information extraction is the next topic completing our portfolio. Initially we looked at the basics of focused crawling, a strategy for crawling the Internet and extracting, for example, all mentions of AT&T and Linux. This led to the design of a crawler with a programmable search policy. Currently we are working on an even more sophisticated algorithm for extracting content from e-shop pages: page segmentation and extraction of the price, product name, etc. These are known problems, typically solved by semi-manually constructing scripts and then running the extraction. Our goal is a highly accurate, general algorithm that works for all e-shops without any customization or training.
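A crawler with a programmable search policy can be sketched like this. To keep the example runnable without network access, the "web" is a small hypothetical in-memory graph, and the policy is a plain keyword counter. Both are assumptions made for the illustration, not our actual crawler.

```python
import heapq

# Hypothetical in-memory "web": url -> (page text, outgoing links).
WEB = {
    "a": ("Linux news and AT&T history", ["b", "c"]),
    "b": ("cooking recipes", ["d"]),
    "c": ("AT&T Unix and Linux kernels", ["d", "e"]),
    "d": ("gardening tips", []),
    "e": ("Linux distributions", []),
}


def keyword_policy(keywords):
    """Build a policy that scores a page by counting keyword occurrences."""
    def score(text: str) -> int:
        lowered = text.lower()
        return sum(lowered.count(k.lower()) for k in keywords)
    return score


def focused_crawl(seed, fetch, policy, max_pages=10):
    """Visit pages in order of expected relevance; return the relevant URLs."""
    frontier = [(0, seed)]  # min-heap of (negative priority, url)
    seen = {seed}
    hits = []
    visited = 0
    while frontier and visited < max_pages:
        _, url = heapq.heappop(frontier)
        text, links = fetch(url)
        visited += 1
        relevance = policy(text)
        if relevance > 0:
            hits.append(url)
        for link in links:
            if link not in seen:
                seen.add(link)
                # A link inherits the relevance of the page that mentions it.
                heapq.heappush(frontier, (-relevance, link))
    return hits


hits = focused_crawl("a", WEB.__getitem__, keyword_policy(["AT&T", "Linux"]))
print(hits)
```

Swapping `keyword_policy` for another scoring function changes what the crawler chases, which is what "programmable search policy" means here. With a page budget (`max_pages`), links found on relevant pages are visited before links found on irrelevant ones.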

In part two of this post I will review our efforts in the Internet of Things.


Friday, December 11, 2015

YodaQA

In the eClub Summer Camp, Peter Baudis developed YodaQA, a question answering system. Today we are publishing the first presentation of the technology.

Friday, June 26, 2015

Five years at CVUT

I am celebrating my five-year anniversary at the CVUT Faculty of Electrical Engineering. The academic year 2014/15 is ending, and this is an opportunity for a short summary.

I left Google five years ago and joined CVUT FEL. At that time the startup movement was flourishing in Czechia, and I started eClub as a platform to help student startups. The idea was to inspire students and show them different career options. I organized numerous motivational presentations and meetings with successful entrepreneurs. Especially at the beginning, I was happy to attract crowds of students. However, after some time I realized that there are many more students who are interested in doing something real but do not want to be involved in business.

Besides the entrepreneurial activities, my main job was teaching. I taught classes in Internet Applications Development and, for the last two years, also the Big Data courses. I have met many great students interested in the latest technology as well as in math and programming. I was also excited to guide many MSc and BSc students and help them with their thesis projects. I am also coaching many PhD students. Over the years, however, I have met only a handful of students who created a startup.

Not too many startups: is that strange? Here are some reasons. The PhD students are attracted by research and theoretical work. MSc and BSc students are too busy completing all their school assignments. I would say that the more responsible the students and the better their grades, the less they are tempted by startups. Meanwhile, many new accelerators and incubators are attracting student teams and helping them start their entrepreneurial careers.

All in all, these reasons led me to a slight modification of the original eClub idea, and last year I organized the first eClub Summer Camp (ESC). I simply created a web page inviting students to join us and work on industrial projects during the summer. I was surprised by how interested the students were in working with the Cloud Computing Center (3C) team. The results were also very good and the students were happy. Many of them still work with us today.

This year we start again with a new round, ESC 2015. Our focus is again on Big Data, Analytics and Mobile. Since last year we have also gained great support from industry. We acquired great partners, including Seznam.cz, AVAST, Alza, Jablotron, Altron, and Certicon, who are helping us with scholarships. The money is great, but most important is that they are asking us to solve real problems. For each problem we get a mentor, who is a company employee, and on top of that we get real data. This is essential for our machine learning and AI research. There is no better data than more data.

Another big step forward for eClub is the new office space we opened at the beginning of this year. Certicon, one of our partners, is kindly sponsoring the rent.

The Cloud Computing Center (3C) is my group at CVUT FEL, researching the same field as eClub. All the PhD students are concentrated here, and here you can find the latest MSc and BSc projects. eClub is very complementary to the 3C research activities, and this gives us many opportunities for research as well as for teaching. There is a great overlap between the 3C and eClub projects. Many eClubbers continue working with PhD students in research or on their theses. The Summer Camp has become a good way of finding talented students. I believe that there are not too many opportunities of this type for students, and eClub is filling this gap.

I am also not forgetting about the startups. Teams interested in building a startup will certainly be welcome in eClub. I am volunteering as a mentor in several incubators, and I am trying to follow the whole landscape of new high-tech companies. I hope some of the eClubbers will become great researchers as well as entrepreneurs in the future.

If you still want to join us, check our web pages. Stay tuned and follow me; I will try to report on projects and progress too.

Friday, April 17, 2015

eClub Summer Camp, the best way to enjoy summer

I am proud to announce the eClub Summer Camp (ESC) 2015. Students can join us to work on Big Data, Machine Learning, Mobile apps, IoT, etc. We offer great projects and scholarships. Startups and students with their own projects are welcome too.

About twenty-five students joined us in the eClub Summer Camp (ESC) last summer. The ESC participants were excited, and many have already asked when we start this summer.

When do we start? The good news is that we are already open in the new eClub incubator in The Blox, Evropská 11, 160 00 Praha 6.

Who can join? ESC is for students, foreign students, doctoral students, and anybody who is interested in building something new and innovative. We will also welcome startup teams with their own interesting projects.

What are the requirements? ESC will be open to all university students in the academic year 2015/16. We require a commitment to work a minimum of 6 consecutive weeks in the ESC incubator on a chosen project.

What are the benefits? You may choose from a large number of interesting industrial projects. eClub will provide experienced mentors to help you become productive as soon as possible. We will organize presentations about new technologies, introductory talks on entrepreneurship delivered by top experts, etc. We will offer scholarships of up to ten thousand CZK per month. You will gain access to a large computer cluster.

What are the projects? We have attracted many more partners this year. ČVUT, the companies Seznam, Avast, Jablotron, Certicon, Alza, and Altron, and the ČVUT Medialab foundation support us not only with scholarships but also by providing interesting projects and data. The projects focus on Data, Mobile and Startup. The Data category includes projects from many areas, such as Big Data, Cloud Computing, Machine Learning, Artificial Intelligence, etc. In Mobile we want to welcome students or teams focusing on mobile devices, wearables and the Internet of Things. We will also consider any suggestions for interesting projects in any high-tech field.

If you are interested, please let us know by filling in this questionnaire. Stay tuned; we will get back to you with more details soon. If you have any questions, just ask. Make your summer productive!