Jan Sedivy: Just some comments

Friday, February 1, 2019

The Alquist Technology Presentation

We hosted an Alquist meeting for our friends and partners this Wednesday (1/30/19). Our goal was to share the Conversational AI technology behind Alquist, which has placed second in the Alexa Prize twice in a row. Here is a copy of the presentation. If you have any questions about the technology, want to use it, or want to join us, do not hesitate to drop us an email. Let me give you a short summary.

In the introduction, we pointed out the recent advances in AI. Deep learning improved speech recognition accuracy. In 2017, Google was one percentage point better than humans. Advances in speech accuracy enabled a new type of device, the intelligent speaker. The best examples are Alexa and Google Home. They are the target platforms for Alquist. The initial presentation slides explained the underlying architecture and how the intelligent speakers carry on a conversation.

In the following part of the presentation, Jan explained the very basics of the Alquist Natural Language Understanding (NLU). He showed some details of our keyword extraction and named entity recognition implementation. Next, he briefly mentioned dialog act recognition and the profanity detector. Profanity detection is a challenging problem, and it is essential for keeping the bot pleasant to talk to. See what kind of language we need to fight.

When people meet, they start chatting. There is an endless number of conversational topics. We tried to teach Alquist the most frequent ones. The current version has 30 topics, such as sports, politics, travel, etc. The Dialog Designer (DD) plays the most important role in making a topic entertaining and interesting. He invents the dialogs. Usually, the bot starts the conversation by uttering an initial sentence. The user replies. We call one such human-bot exchange a turn. Each topic is then a set of a large number of turns. The DD prepares the turns for Alquist in the form of templates.

Petr is responsible for the dialog manager (DM), which orders the turns into a sensible dialog. In his MSc dissertation, he developed an original hybrid DM, which we use in Alquist. The offline part of the DM has two components: a graphical UI and the training set generation. In the UI, the DD builds a dialog tree (a composition of turns) for a topic from graphical dialog objects. The second component processes the dialog turns of the topic and automatically creates a rich set of sentences, the training set. We use the set to train an LSTM neural net, which predicts the next sentence on a topic. The production DM runtime is a mixture of several rules and an NN for each dialog topic. Overall, such a system handles a large number of different sentences. The hybrid design makes the Alquist conversation much more robust.
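To make the hybrid idea more concrete, here is a minimal Python sketch: a few hand-written rules are checked first, and a per-topic learned model picks the next turn otherwise. It is only an illustration under stated assumptions. The class names, the rule format, and the keyword-matching `predict` stub standing in for the trained LSTM are all hypothetical and are not the Alquist implementation.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Rule:
    """Hypothetical hand-written rule: if `matches` fires, go to `next_turn`."""
    matches: Callable[[str], bool]
    next_turn: str


@dataclass
class TopicModel:
    """Stand-in for a trained per-topic model (an LSTM in the post)."""
    keyword_to_turn: Dict[str, str]
    fallback_turn: str

    def predict(self, utterance: str) -> str:
        # A real model would score candidate turns; this stub matches keywords.
        lowered = utterance.lower()
        for keyword, turn in self.keyword_to_turn.items():
            if keyword in lowered:
                return turn
        return self.fallback_turn


@dataclass
class HybridDialogManager:
    rules: List[Rule]
    models: Dict[str, TopicModel] = field(default_factory=dict)

    def next_turn(self, topic: str, utterance: str) -> str:
        # Rules take priority; the topic model handles everything else.
        for rule in self.rules:
            if rule.matches(utterance):
                return rule.next_turn
        return self.models[topic].predict(utterance)


dm = HybridDialogManager(
    rules=[Rule(lambda u: "stop" in u.lower(), "goodbye")],
    models={
        "movies": TopicModel({"actor": "ask_favorite_actor"}, "ask_favorite_movie")
    },
)
print(dm.next_turn("movies", "I really like that actor"))
```

Keeping the rules small and explicit while the model covers the long tail of user sentences is the design choice the sketch tries to show.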

We also wanted to show how to use the Alquist technology in other modalities. Together with Rebel & Glory, we have put together an interactive movie about Alquist. Watch it to learn how we chose the name Alquist and to learn about Karel Capek. We have also prepared a short interactive invitation for our presentation. The team first shot the clip and wrote the dialogs. Ondrej even acted as a guide. This movie may inspire you to author interactive ads or simple product introductions. We are looking forward to working with creative teams on the conversational part of entertaining clips. If you have an interesting idea, let us know. In the presentation, we have pointed out some of the unexpected questions. Try them!

The last speaker was Ondrej, who is our DD. He invents and designs the dialogs. The dialogs have two parts: the Alquist messages and the user replies. The DD needs to work on both. He predicts what the user will be interested in, and he writes the templates that react to the user's requests. Language is very complicated. The user does not want to hear repeated or tiresome replies. The conversation must be interesting. Therefore, Alquist creates responses from templates as well as with generative algorithms. Ondrej also gave an overview of some of the underlying problems and linguistic peculiarities he had encountered.
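As a rough illustration of the template side only (the generative part is not covered), the following Python sketch fills a slot into one of several templates and never uses the same template twice in a row. The class name, the templates, and the `movie` slot are made-up examples, not taken from Alquist.

```python
import random
from typing import Dict, List, Optional


class TemplateResponder:
    """Fills response templates and avoids repeating the last one used."""

    def __init__(self, templates: List[str], seed: Optional[int] = None) -> None:
        if not templates:
            raise ValueError("at least one template is required")
        self._templates = list(templates)
        self._rng = random.Random(seed)
        self._last: Optional[str] = None

    def respond(self, slots: Dict[str, str]) -> str:
        # Exclude the previous template whenever an alternative exists.
        candidates = [t for t in self._templates if t != self._last]
        if not candidates:
            candidates = self._templates
        template = self._rng.choice(candidates)
        self._last = template
        return template.format(**slots)


responder = TemplateResponder(
    [
        "I love {movie} too!",
        "What did you like about {movie}?",
        "{movie} is a great pick.",
    ],
    seed=0,
)
replies = [responder.respond({"movie": "Alien"}) for _ in range(20)]
```

Tracking only the last template is the simplest possible policy; a longer history or per-user memory would be a natural extension.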

The Alquist presentation attracted almost eighty people to our beautiful top-floor presentation room in the CIIRC building. We were very excited by the fruitful discussion, with plenty of questions, at the end of our meeting. If you are interested, you may go through the presentation too. Enjoy, and let us know if you like it or if you have any questions. We would also be happy to hear your ideas for new applications.

Wednesday, December 26, 2018

Alquist PF 2019, the Alexa Prize Finalist.

Alquist PF 2019

With our social bot Alquist, we have finished second in the Alexa Prize for the second year in a row, out of more than 100 academic teams. Last month the students received a 100K USD prize during the Amazon re:Invent 2018 conference in Las Vegas. We are back at CTU CIIRC, in our lab again. During the long overseas flights, we started to think about what to do next. We want to present our technology in a slightly different but more entertaining form. The end of the year, the holiday season, is a great chance to engage with our friends, partners, students, and everybody else. It was a simple decision. We have prepared an interactive PF 2019, a sneak preview of how creative we can be with the Conversational AI.

Recently Amazon updated the Echo Show, and Google introduced the Google Home Hub. These screen-equipped devices combine the Conversational AI with visuals to make the user experience even more engaging. In Alquist, we believe in multimodal interfaces too. The most ubiquitous multimodal device is definitely the smartphone. The trend is clear. Google and Apple keep reporting increasing numbers of voice interactions.

Currently, our demo needs the Chrome browser. It is a limitation, but you can run it on a smartphone and on a desktop with either macOS or Windows. One of the big problems for Google Home or Alexa is discoverability. By choosing the smartphone and the browser, we can just post a link, and no installation is required. We are not bound to existing platforms. The user only needs to allow the browser to access the mic. We still do not support the iPhone, which does not allow mic access. If you own an Android smartphone, you are only one click away. Try it!

We have partnered with Dazzle Pictures. They created a fantastic animation in no time, and we designed and implemented the dialogs. At first glance, the dialog may look pretty simple, with ambiguous answers, but do not be mistaken. Give it a try, test it! For example, Alquist understands only the standard set of English names. Try how the dialog changes if it does not recognize your name correctly. Discover how the snowman replies once you utter something it cannot understand. We also handle positive and negative messages, etc. Have fun and let us know what you think!

Also check out our first video-supported demo, the Alquist Story. It is a high-quality movie with real actors introducing Alquist. You can also ask many questions about Karel Capek. It took us a very long time to get it into reasonable shape. We spent time shooting the movie and a lot of time in post-processing, and we had to overcome various technical problems too. On the other hand, the animated PF 2019 was a quick shot: everything was prepared and in good shape.

We believe there are many opportunities for using our technology in many different business segments. We can imagine answering product questions, asking users for their preferences, helping to set up devices, etc. We are looking for more ideas about how to commercialize our know-how. Help us to discover new opportunities!

Thursday, November 29, 2018

We are second again

We have made it again with the Alquist bot. Alquist competed with the Alana bot from Heriot-Watt University, Edinburgh, Scotland, and the winning Gunrock bot from the University of California, Davis, CA, USA.

A $500,000 prize was awarded to the winning team. We are bringing back $100,000 in prize money, and Alana receives $50,000. The additional challenge, a $1 million research grant, has not been awarded yet. It will take some time before a bot can keep up a 20-minute chat. Just imagine how difficult it would be to walk into a bar and talk to a stranger for 20 minutes.

The Alexa Prize is a $3.5 million challenge for university teams to advance human-computer interaction. As last year, the goal was to develop the best social bot, one that converses coherently and engagingly with humans on a range of current events and favorite topics such as entertainment, sports, politics, technology, and fashion. We continued this year with Alquist II, starting at the beginning of 2018 when the Amazon Alexa Prize was announced. We submitted our proposal, and we made it among the eight semifinalists who were selected from more than a hundred teams from 15 countries. Amazon awarded us a $250,000 research grant, Alexa-enabled devices, and free Amazon Web Services (AWS) to support our development efforts.

The research grant was significant support for our team. The team leader, as last year, was Jan Pichl, who is in the third year of his Ph.D. program in Conversational AI at the Faculty of Electrical Engineering. This year, charged with enthusiasm, the team decided to drop the first version of Alquist and started from scratch with an entirely new, superior design. We built on the latest neural network technology in combination with a small number of rules to conduct the dialogs. Alquist II knows how to react to most conversational utterances, but it excels in 26 selected topics. A great deal of the quality improvement came from the large number of users conversing with our bot. Each conversation helps us to better understand the complexity and to select the best matching reply.

In-depth knowledge is required to create an exciting and entertaining conversation. Where to get the content? The web is an endless source of interesting facts, but mostly in the form of written text. When played back, it feels too long and a little tedious. To make the conversation natural, we had to solve this problem. A large part of the work went into knowledge acquisition and processing. If you are a lucky owner of an Alexa device, you can test the Alquist abilities: just say "let's chat with Alquist."

The announcement of the Alexa Prize winners was part of the AWS re:Invent conference in Las Vegas. The finalists were invited. We all enjoyed a grand celebration as well as the gathering.

Amazon is investing a lot in the development of intelligent conversational gadgets led by Alexa. The experts predict that speech, the most natural way of communication, will in the near future become an additional channel to control appliances, access knowledge, etc. It is fascinating and inspiring to find our team among the leading groups in the world working on the latest technology with an exciting vision. We hope our success will attract new students to join our team and pursue our adventure next year. Let us know!

Friday, August 31, 2018

We are the Alexa Prize finalists again

We have made it to the Alexa Prize 2018 finals again with our social bot Alquist. Our competitors are the Alana bot from Heriot-Watt University, Edinburgh, Scotland, and the Gunrock bot from UC Davis, Davis, CA.

It was almost exactly one year ago that I wrote the last blog post. At that time we were excited to get to the Alexa Prize 2017 finals, and today we celebrate again: the Alquist team has made it to the finals once more. It was a hectic time.

We have completely redesigned our bot. This year, when we started the semifinals, we had too little data to train the new AI. As the number of interactions was growing, we were increasing the training sets and improving the accuracy. We have augmented the dialog act classifier, which processes every new user utterance. It uses a convolutional neural network and classifies the utterances into around thirty classes. The most significant change in the overall architecture is the dialog manager. Last year we used a rule-based approach. It was great for cooperative users, but once the user did something unexpected, we had trouble. It was also a very laborious process to write the rules. We ended up with hundreds and hundreds of rules. It was also challenging to update or enhance the dialogs. The latest Alquist uses hybrid dialog management. We have reduced the rule-based decisions to a minimum and made the principal part controlled by an LSTM neural network. We have many LSTM models for different sub-dialogs. The sub-dialog models are trained and updated on excerpts of the bot-user interactions. The hybrid approach significantly reduced the amount of work necessary to create a new dialogue compared to last year's rule-based approach. This allowed us to broaden the range of conversational topics substantially. We have also taken advantage of delexicalizing the utterances, that is, replacing concrete entity values with generic placeholders, to improve the training speed. The bot includes several other neural networks that help to switch between different topics, estimate the sentiment, etc. The whole system is getting quite complicated. We have also spent a lot of effort on improving the acquisition of new information. We are crawling several social media sites. The discussions are an additional source of interesting facts. Social media are a great complement to knowledge databases with factoid information, such as Wikipedia.
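Delexicalization can be illustrated with a short Python sketch: known entity values are replaced by generic placeholders, so that many different sentences collapse into one training pattern. The function name, the `<slot>` placeholder format, and the tiny entity lists are assumptions made for this example; the post does not describe how Alquist implements this step.

```python
import re
from typing import Dict, List, Tuple


def delexicalize(
    utterance: str, entities: Dict[str, List[str]]
) -> Tuple[str, Dict[str, str]]:
    """Replace known entity values with placeholders such as <movie>.

    Returns the delexicalized text and a placeholder-to-value map so the
    original wording can be restored later.
    """
    found: Dict[str, str] = {}
    text = utterance
    for slot, values in entities.items():
        placeholder = f"<{slot}>"
        # Try longer values first so multi-word names win over their prefixes.
        for value in sorted(values, key=len, reverse=True):
            if placeholder in found:
                break  # this sketch fills each slot at most once
            pattern = re.compile(r"\b" + re.escape(value) + r"\b", re.IGNORECASE)
            if pattern.search(text):
                found[placeholder] = value
                text = pattern.sub(placeholder, text, count=1)
    return text, found


text, slots = delexicalize(
    "I watched Blade Runner with Harrison Ford",
    {"movie": ["Blade Runner"], "person": ["Harrison Ford"]},
)
print(text)
```

With the placeholders in place, "I like Alien" and "I like Titanic" become the same training sentence, which is one plausible reason why the training set shrinks and training gets faster.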

The team has changed a little compared to last year. Roman has left, and Petr Lorenc has joined. He is helping a lot with intent and entity recognition, which is an essential part of Alquist and has a significant impact on the overall user experience. Currently, everybody is very busy, since we have at least another two months to improve the functionality. We will focus on the user experience. Since English is not our native language, we have to spend a lot of effort on ironing out all the conversations, adding SSML, etc. Amazon will offer only the first three bots to the Alexa device owners, which means we will get more data. More data gives us a chance to further improve the accuracy.

As last year, Amazon will announce the winners at the Amazon re:Invent conference in Las Vegas. We are looking forward to visiting Las Vegas, the heart of gambling, to meeting our competitors and the helpful Amazon Alexa Prize staff, and to learning the latest about Amazon technology. We were second behind the Washington team last year. Guess what our aspirations are this year. If you are a lucky Alexa device owner, try "Alexa, let's chat." Keep your fingers crossed!

Wednesday, August 30, 2017

Alquist made it to the Alexa finals

The CVUT Alquist team, together with two other teams, has made it to the finals of the $2.5 million Alexa Prize, a university competition. Our team has developed the Alquist social bot.

The whole team met in the eClub during summer 2016. At that time we were working on a question answering system, YodaQA. YodaQA is a somewhat complex system, and the students learned classic NLP. Of course, everybody wanted to use Neural Networks and design End to End systems. At that time we were also playing with simple conversational systems for home automation. Then, to our surprise, Amazon announced the Alexa Prize, and it all clicked together. We quickly put together a team and submitted a proposal. One Ph.D., three MSc, and one BSc student made up a team with strong experience in NLP. In the beginning, we were competing with more than a hundred academic teams trying to get into the top twelve and receive the 100k USD scholarship funding. We were lucky, and once we were selected in November 2016, we began working hard. We started with many different incarnations of NNs (LSTM, GRU, attention NN, ...), but soon we realized the bigger problem, a lack of high-quality training data. We tried to use many sources, such as movie scripts, Reddit dialogues, and others, with mixed results. The systems performed poorly. Sometimes they picked an interesting answer, but mostly the replies were very generic and boring. We humbly returned to the classical information retrieval approach with a bunch of rules. The final design is a combination of the traditional approach and some NNs. We finally managed to put together an at least somewhat reasonable system that keeps up with a human for at least tens of seconds. Here the forced labor started. We invented and implemented several paradigms for authoring the dialogues and acquiring knowledge from the Internet. As the first topic, we chose movies, since it is also our favorite topic. Then, step by step, we added more and more dialogues. While perfecting the dialogues, we were also improving the IR algorithms. We improved the user experience further when Amazon introduced SSML. Since then, the Alexa voice has started to sound more natural.
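For readers who have not seen SSML, here is a minimal Python sketch that wraps a few sentences in a `<speak>` element and puts a short `<break>` pause between them. The helper name and the 300 ms default are arbitrary choices for illustration; the post does not say which SSML tags Alquist actually uses.

```python
from typing import List
from xml.sax.saxutils import escape


def to_ssml(sentences: List[str], pause_ms: int = 300) -> str:
    """Wrap sentences in <speak> and separate them with short pauses."""
    pause = f'<break time="{pause_ms}ms"/>'
    # Escape &, < and > so plain text cannot break the SSML markup.
    body = pause.join(escape(sentence) for sentence in sentences)
    return f"<speak>{body}</speak>"


ssml = to_ssml(["Hi there.", "Tom & Jerry is fun."])
print(ssml)
```

Escaping matters because movie titles and user-provided names often contain characters such as `&` that are not valid in raw XML.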

While developing Alquist, we have gained a lot of experience. A significant change is the fact that we have to look at Alquist more as a product than as an interesting university experiment. The consequences are dramatic. We need to keep Alquist running, which means we must test every new version very well. Testing conversational applications is by itself a research problem. We have designed software to evaluate user behavior statistically. The first task is to find problems in the dialogues, misunderstandings, etc. Second, we try to estimate how happy the users are with particular parts of the conversation, so that we can make further improvements. Thanks to Amazon we have reasonably significant traffic, and since we are storing all conversations, we can accumulate a large amount of data for new experiments. Extensive data is a necessary condition for training more advanced systems. We have many new ideas in mind for enhancing the dialogues. We will report on them in future posts.

Many thanks for the scholarship go to Amazon, since it was a real blessing for our team. It helped us to keep the team together with a single focus on a real task. The students worked hard for more than ten months, and that helped us to be successful.

Today we are thrilled that we made it to the finals, together with the University of Washington in Seattle with their Sounding Board and the wild card team from Heriot-Watt University in Edinburgh, Scotland, with their What’s up Bot. Celebrate with us and keep your fingers crossed. There is half a million at stake.

Tuesday, June 6, 2017

New projects for this summer

This year we are opening the eClub Summer Camp in the new CIIRC building. We have prepared exciting projects from the fields of AI, IoT, and the Internet. We will focus on conversational AI, how to program assistants to control your household, Natural Language Processing, and other topics; see the projects page.

Two years ago we started to work on the question answering engine YodaQA. Last year during the eClub Summer Camp we designed the first bot. Our primary goal for this summer is to create a great Echo application. Echo is a voice-controlled smart speaker made by Amazon. You simply ask it to play music, ask a factoid question, carry on a simple dialog, or control your household. There is amazing technology behind the set of new Amazon devices: first of all the speech recognition, then the directional microphone, conversational AI, knowledge database, etc. The eClub team is among the first in the world working directly with the Amazon research group on making Alexa even smarter. We want to make her sexy, catchy, and entertaining, and that requires a lot of different skills, from linguistics up to the design of Neural Networks. We have many well-separated problems for any level of expertise. Come to see us; we are preparing an introductory course to teach you how it is done. We will help you to create your first app with initial skills. You can meet a lot of students who work on the Conversational AI and who will help you to get over the underlying problems.

We want to make the Conversational apps not only entertaining but also knowledgeable. Alexa must also be very informative. It must know, for example, the latest news in politics, the Stanley Cup results, or what the best movies are, and I am sure we could continue with many other topics. Knowledge is endless, and it is steadily growing. Handling the always increasing data requires processing many news feeds and different sources, accessing different databases, accessing the web, etc. The news streams must be understood, and the essential information must be extracted. There are many steps before we retrieve the information. Especially today we need to be careful, and every piece of information must be verified. We try to create canonical information using many sources of the same news. As soon as the information is clear, we need to store it in a knowledge database. The facts need to be linked to information already in the database. And how about fake news, how do we recognize it?

Building the Conversational AI is not only about voice-controlled devices. We may want to create a system that automatically replies to user emails or social media requests. Imagine, for example, a helpdesk where users ask many different questions on topics from IT to HR. Very frequent questions are, for example, how to reset a password or how to operate a printer or a projector. Why not answer them automatically? And we can be much more ambitious. Many devices are quite complex, and it is not easy to read a manual. It is much faster to ask a question such as “How do I reset my iPad,” or “How do I share my calendar.” These apps are put together from two major parts: the understanding of the question and the preparation of the answer. Both use the NLP pipeline. If you expand on this idea, you may find a million applications with a similar scenario. An automated assistant can at least partly handle every company-customer interaction. To make a qualified decision, executives need fast access to business intelligence. Why not ask questions such as “What was the company performance last week,” “What is the revenue of my competitors,” etc.

Let me mention another aspect of our effort. The latest manufacturing lines extensively use robots, manipulators, etc. (INDUSTRY 4.0). The whole process is controlled by a large number of computers. What if something stops working? It is a very complicated task to fix a line like this. Every robot or manipulator might be from a different manufacturer and programmable in a slightly different dialect. Is there anybody in the company who can absorb the complete knowledge needed to localize the problem? Yes, it is a robot, which has all the knowledge in a structured form. The robot can apply optimization to find the best set of measurements or tests to help the maintenance technician. To make this happen, we need, in addition to a productive dialog and a knowledge database, an optimization that suggests the shortest path to fixing a problem. The robot can then guide humans to repair the problem most efficiently.

Yes, I almost forgot. Recently it has become very popular to use robots to control the household. Alexa, turn off all the lights. Alexa, what is the temperature in the wine cellar? We want to invent some of these goodies and build them into our new eClub space during the summer. Our colleagues have developed a Robot Barista application that shakes drinks on demand. A voice user interface will make it even more entertaining. We have other exciting devices and small gizmos deserving voice control. You may also come with your own ideas. Join us, and we will help you to be successful.

These are just a few use cases we will try to tackle during this season. If you want to learn the know-how behind them, join us. We will help you, and we will also award a scholarship.

Sunday, May 21, 2017

Conversational AI for Dungeons and Dragons

We are starting the 2017 eClub Summer Camp. eClub has moved to the new CIIRC building. We are competing in the Alexa Prize competition. Join eClub and learn the latest machine learning and NLP algorithms.

A few years ago we started with a question answering system, YodaQA. It was inspired by IBM Watson beating the best players in Jeopardy. Today we continue our journey with even more challenging projects. We are creating dialogs for the latest voice-controlled appliances. We have entered the Alexa Prize competition, and it helped us to develop Alquist, the social bot. We have free access to the immense power of AWS, and we are in constant touch with the Amazon research staff. Every member of the Alquist team has become an NLP expert. The Alquist system is getting better day by day. Currently, Alquist can conduct a sensible short dialog. The user can choose from several topics: sports, politics, celebrities, jokes, etc.

Thousands of users are using the chat, and we receive valuable logs that keep us very busy. It takes a lot of time to get through the details and to discover why a user stopped the conversation, but we are learning a lot. What is a social dialog? What are the catchy questions? How to respond quickly and interestingly? There are still a lot of questions, but we are ambitious, and we want to extend the Alquist knowledge to handle a long and interesting dialog. If you would like to test it, say "Alexa, let's chat."

Natural Language Processing (NLP) underlies Alquist. NLP is also an essential part of Siri, Cortana, Alexa, Google Assistant, and other recent bots. The Conversational AI builds on machine learning, optimization, etc., and it takes advantage of all the latest developments in machine learning, from the classical algorithms up to the latest deep neural networks, sequence to sequence models, memory networks, etc. This summer we want to considerably improve the Alquist capabilities. To achieve our goals, we need to enlarge the Alquist team and focus on the Conversational AI. If you are a BSc, MSc, or Ph.D. student, join us. We have various programs, including ones for Ph.D. candidates.

In addition to AI, we also need creative, innovative individuals who know how to handle a dialog. We all know that carrying on an interesting dialog is an art. Teaching Alquist interesting dialogs is even more complicated. Creative young people with many different skills in human-to-human conversation are welcome to join us.

A dialog is also about exchanging information and experience. Imagine, for example, a bot helping you to play an adventure game. RPGs have many rules, and it is very boring to search for information in a handbook. One of the very popular RPGs is Dungeons and Dragons. We want to design an interactive Alexa D&D handbook and improve level and XP progression. If you are interested in D&D, join us and help us to design a voice-controlled interactive manual.

The NLP space is huge, and we have a large number of interesting topics to work on. If you are interested in AI, machine learning, neural nets, etc., join us. We have great resources, a lot of experience, and funds to award you scholarships.
