At Kairos, one of our goals is to connect people to one another through our technology. To do this we need to know how developers use our products. What are your challenges and where are you finding success? Not only does this information help us create a better product and experience for you, it helps us all learn and unite as a community.
Every now and then, Kairos features an interview with one of our users, giving them a chance to share their journey, knowledge, and advice on working with our API and SDKs.
This time we're looking at Himel Mondal (pictured above, far left), an engineering student from Canada. Himel caught our eye when he posted his coding project to Hackster. He mashed up a Raspberry Pi, Amazon Alexa, and Kairos Face Recognition. The results were, in a word, amazing. Check out the video demo at the end of this article.
Meet the Developer
Himel Mondal is a second-year Mechatronics Engineering student at the University of Waterloo in Waterloo, Ontario, Canada. His program combines disciplines such as Mechanical, Electrical, Systems and Software Engineering for use in fields such as robotics. He has a particular interest in the software aspect and in how Machine Learning can be interfaced with robotic systems.
Programming built on his interest in mathematics: it clicked right away, and he knew it was something he wanted to pursue. What really pushed him was the fact that programming touches so many different fields and can be used to create almost anything.
With the emergence of Machine Learning and Deep Learning, Himel was amazed that computers could actually learn and perform tasks that were once thought to be impossible.
After looking further into Machine Learning, he wanted to build something that used the technology. That's when he and his friend, Abbass Ayoub, brainstormed a project that would incorporate machine learning of some sort. This pursuit eventually led them to facial recognition, and that's how they discovered Kairos.
Being new to Machine Learning, Himel found Kairos' easy-to-use API helped them implement facial recognition into their project, which ended up as 'Alexa, Who's At The Door?'.
Tell us about your idea for 'Alexa, Who's At The Door?' and your thought process behind it.
'Alexa, Who's At The Door?' was the result of thinking about how we could make this new voice interaction software from Amazon - called Alexa - more intelligent. This project utilizes a camera, a Raspberry Pi, Amazon Alexa, and Kairos Facial Recognition in order to allow a user to ask Alexa about who exactly is knocking at their door.
The interaction of the app is as follows:
- Someone knocks on your door and you hear their knock.
- You ask "Alexa, who's at the door?".
- Alexa takes a picture of the person at your door and analyzes their face using the Kairos API.
- If their face exists in the repository of faces you have approved, Alexa will tell you that person is at your door.
- If the person isn't recognized, Alexa will let you know that an unrecognized person is at the door.
- You can look through your door peephole to see who's there, and if you recognize them, you can tell Alexa to train their face into the repository of faces by giving them a name.
- Alexa is now smarter and knows one more person.
This entire application is controlled via voice on the Amazon Echo.
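The interaction above can be sketched as a small decision flow. This is only an illustration, not the project's actual code: the function names and spoken phrases are made up here, and the camera and face recognition steps are passed in as plain callables so the sketch stays self-contained.

```python
from typing import Callable, Optional


def answer_who_is_at_door(
    take_picture: Callable[[], bytes],
    recognize: Callable[[bytes], Optional[str]],
) -> str:
    """Return what Alexa should say when asked who is at the door.

    take_picture and recognize are hypothetical stand-ins for the
    camera on the Raspberry Pi and the face recognition service.
    """
    photo = take_picture()
    name = recognize(photo)  # None means the face is not in the repository
    if name is None:
        return "An unrecognized person is at the door."
    return f"{name} is at the door."


def train_new_face(
    photo: bytes,
    name: str,
    enroll: Callable[[bytes, str], None],
) -> str:
    """Add a face to the approved repository under the given name."""
    enroll(photo, name)
    return f"Okay, I will remember {name}."
```

Keeping the camera and recognition steps as parameters means the flow can be exercised without any hardware or network access.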
What made you decide to choose Kairos and our technology?
This project was created in the span of two weeks over the summer. Since Abbass and I were both inexperienced, we were on the search for an API that could help us out. When we found Kairos, it immediately stood out as something we could use, so I decided to test the facial recognition API with pictures of rappers' faces, including Drake and Kendrick Lamar. To my surprise, the API worked very nicely, and so I decided to use it in our project.
Can you explain to us the process in which you created your application and how our technology fit into it?
It all started with us wanting to create something cool that hadn't been made before. From that, over brainstorming sessions on Skype, we devised a high-level overview of how the application was going to work.
From there, Abbass worked on integrating the Raspberry Pi and the camera with this project to be able to communicate with Amazon Web Services over the internet. I worked on using the Kairos API, setting up the Alexa application, and interfacing Amazon Lambda with Firebase for Raspberry Pi communication.
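For readers unfamiliar with how an Alexa skill talks to a Lambda function, here is a minimal sketch of a handler. It is not the project's code: the intent name `WhoIsAtTheDoorIntent` and the spoken phrases are hypothetical, and the step that would contact the Raspberry Pi (through Firebase, in the project described above) is left as a comment.

```python
def build_speech_response(text, end_session=True):
    """Wrap plain text in the JSON structure an Alexa skill returns."""
    return {
        "version": "1.0",
        "response": {
            "outputSpeech": {"type": "PlainText", "text": text},
            "shouldEndSession": end_session,
        },
    }


def lambda_handler(event, context):
    """Entry point that Lambda calls with the Alexa request as `event`."""
    request = event.get("request", {})

    if request.get("type") == "LaunchRequest":
        return build_speech_response(
            "Ask me who is at the door.", end_session=False
        )

    if request.get("type") == "IntentRequest":
        intent_name = request.get("intent", {}).get("name")
        if intent_name == "WhoIsAtTheDoorIntent":  # hypothetical intent name
            # A real skill would ask the Raspberry Pi for a photo here
            # and wait for confirmation before answering.
            return build_speech_response("Let me check who is at the door.")

    return build_speech_response("Sorry, I did not understand that.")
```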
Once the Alexa voice interaction was initiated and we received confirmation that the Raspberry Pi had taken a picture, the Kairos Face Recognition API analyzed that picture to see whether it could recognize the person in it. If the person wasn't recognized, another endpoint of the API was used to train a new name and face into the repository of faces.
Using the Kairos API saved us the time of implementing facial recognition ourselves, which is its own beast, so we could focus on improving the communication between the Amazon Echo and the Raspberry Pi.
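The two calls described here can be sketched roughly as follows. This is an illustration under stated assumptions, not the project's code: it assumes the Kairos `recognize` and `enroll` endpoints at `https://api.kairos.com`, with `app_id` and `app_key` headers and a JSON body. The response shape handled by `best_match`, the gallery name, and the local confidence threshold are also assumptions, so check them against the current Kairos documentation. The sketch only builds requests and parses a response; it does not send anything.

```python
import json
import urllib.request

KAIROS_BASE_URL = "https://api.kairos.com"  # assumed base URL


def build_kairos_request(endpoint, payload, app_id, app_key):
    """Build (but do not send) a POST request for a Kairos endpoint."""
    return urllib.request.Request(
        f"{KAIROS_BASE_URL}/{endpoint}",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "app_id": app_id,    # assumed credential header names
            "app_key": app_key,
        },
        method="POST",
    )


def recognize_payload(image_b64, gallery_name):
    """Body for asking whether a face is already in the gallery."""
    return {"image": image_b64, "gallery_name": gallery_name}


def enroll_payload(image_b64, subject_id, gallery_name):
    """Body for training a new name and face into the gallery."""
    return {
        "image": image_b64,
        "subject_id": subject_id,
        "gallery_name": gallery_name,
    }


def best_match(response, threshold=0.6):
    """Return the recognized name, or None if nobody matched.

    Assumes a response of the form
    {"images": [{"transaction": {"status", "subject_id", "confidence"}}]}.
    The threshold is an illustrative value, not a recommended setting.
    """
    for image in response.get("images", []):
        transaction = image.get("transaction", {})
        if (
            transaction.get("status") == "success"
            and transaction.get("confidence", 0) >= threshold
        ):
            return transaction.get("subject_id")
    return None
```

To actually send a request, pass it to `urllib.request.urlopen` and decode the JSON body of the reply before handing it to `best_match`.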
What challenges did you face while building the application? Any success?
Structuring the interaction flow to deliver a great user experience was definitely a challenge, because there are so many ways to approach this experience. We worked through it by making a visual flow chart of all the ways a user could interact with our application.
Himel's voice-user interface (VUI) design.
We posted step-by-step instructions on how to set up this project on a platform called Hackster.io. In terms of success, this project gained a lot of traction relative to the community size on that site, with 12,000+ views and 100+ respects. I'd also like to think that this project was one of the reasons why I got an offer for my next internship at an augmented reality startup.
What’s next for 'Alexa, Who's At The Door?'? Do you plan on further refining it?
I'd like to polish up the project with my partner and make it scale to production in order to deploy it to the Alexa store soon. This would allow this project to gain even more traction. A further possible enhancement would be to make it compatible with existing consumer security cameras such as the NestCam, so Alexa users wouldn't have to buy and set up a Raspberry Pi.
What are you working on now?
I'm currently the object classification team lead for my university's autonomous vehicle team. I'm learning a lot about the integration of deep learning in computer vision. I also plan to utilize these skills at my upcoming internship. I'm also always looking to create new side projects whenever I have free time, so expect to see more activity on my GitHub page soon.
What do you think about Human Analytics (facial recognition and emotion analysis specifically) integration with technology?
On top of facial recognition, I think emotional analysis could definitely have a profound impact on the level of security that this application could provide. Estimating the current emotional state of a person at your front door or around your premises could indicate malicious intent and help prevent dangerous situations from occurring.
How did you get into coding and what resources helped you learn more about the field?
My journey in programming started early in the 10th grade in my high school computer science class. From there, a mix of school, hackathons, side projects and internships has led me to continually improve my knowledge and skills in this field. Doing these things also boosts my creativity and problem-solving skills when I think about how my code can positively affect something.
What do you think you will be doing 10 years from now?
I see myself as the head of a large Artificial Intelligence hardware company.
As promised, here is a full video demo of 'Alexa, Who's At The Door?'
Learn More and Get Involved
At Kairos, we love discovering developers who have come up with inventive ideas using our technology. We are grateful to have been able to connect with Himel and Abbass, and we thank them for sharing their story with us. If you're feeling inspired, create a Kairos account and start hacking today!