Look at That Facial Expression!

At the invitation of NUS High School of Mathematics and Science, Tuan and Hady interacted with a group of 24 eager and precocious students from Year 4 to Year 6 during a hands-on workshop on image classification.

The theme of the workshop was “Facial Expression Recognition with Neural Networks”.  It covered how to build an image classifier using neural networks, with an application to facial expression recognition: classifying whether a photo shows an emotion such as happiness or sadness.

Tuan covering the neural network implementation of an image classifier using TensorFlow, with the students following a set of hands-on exercises

We covered several topics including multilayer perceptrons and convolutional neural networks, first explaining the concepts and then guiding the students through hands-on exercises to experiment with various neural network architectures.

Hady explaining convolution, an operation that helps neural networks to capture spatial features useful for image classification
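For readers who were not in the room, here is a minimal sketch of the convolution operation in plain Python. It is not the workshop's TensorFlow code; the `conv2d` helper, the toy image, and the edge-detecting kernel are all made up for illustration.

```python
def conv2d(image, kernel):
    """Slide the kernel over the image (no padding, stride 1) and sum the
    element-wise products at each position. This is the cross-correlation
    that CNN layers conventionally call convolution."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    output = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            total = 0
            for i in range(kh):
                for j in range(kw):
                    total += image[r + i][c + j] * kernel[i][j]
            row.append(total)
        output.append(row)
    return output


# Toy 4x4 image: dark left half (0), bright right half (1).
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]

# Hypothetical 2x2 kernel that responds to a dark-to-bright vertical edge.
kernel = [
    [-1, 1],
    [-1, 1],
]

feature_map = conv2d(image, kernel)
print(feature_map)  # [[0, 2, 0], [0, 2, 0], [0, 2, 0]]
```

The output is non-zero only in the middle column, exactly where the edge sits. That is the sense in which convolution captures spatial features: the same small kernel detects the same pattern wherever it appears in the image.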

To make the learning experience more entertaining and engaging, we put up a Web service that allowed students to upload their own photos to test the output classification (happy or sad).  It injected a lot of fun and exuberant moments, as the students were pushing the image classifier to the limit.  This helped the audience to appreciate the architectural designs and the model’s sensitivity to training data.

Students were having a blast testing the facial expression recognition Web service

Judging from the students’ facial expressions at the end of the workshop, we could see that they had enjoyed the lesson.  No, we did not need the image classifier to see that. For more happy faces, check out our tutorial materials.

In Defense of Comparisons

On 2 November 2018, Maksim defended his PhD dissertation entitled “Comparison Mining from Text” before the dissertation committee at SMU School of Information Systems.

In his own words:

Comparison mining aims at understanding the opinion mining problem when multiple entities are present simultaneously. This includes, but is not limited to deriving similarities and differences between entities and discovering information about the entity relations.

His dissertation aims at building a comparison mining system.  The identification of comparisons in opinionated text serves as the basis for mining comparative relations in terms of which entity is preferred over another.  In turn, these comparative relations may reveal several distinct preference groups, which may rank entities differently.  Finally, by aligning topics discovered from text corpora to the comparisons, we can explain why some entities may be preferred.

Maksim providing an overview of the components of his dissertation

The dissertation chapters are all buttressed by publications in the top venues in natural language processing (ACL’15, ACL’18), text mining (CIKM’14, CIKM’16, TKDE’17), and artificial intelligence (AAAI’19).  With such a prolific record, it is no wonder he was awarded the SMU Presidential Doctoral Fellowship twice, in the consecutive years of 2017 and 2018.

A summary list of Maksim’s publications up to defense

On a nostalgic note, it has been an amazing journey since he first joined the group as a research assistant in 2013.  The journey’s twists and turns culminated in the PhD defense taking place in record time, less than 3.5 years since he was officially enrolled in the SMU PhD program.

Group outing in December 2013 (clockwise from left: Loc, Hady, Maksim, Thuy and Tuan Le, Trong Nguyen, Son)

Conquering defense is a momentous victory that calls for a celebration.  What better way to celebrate the setting of a high bar than going to a rooftop bar?

Group celebration in November 2018 (clockwise from left: Hoang, Trong Le, Maksim, Ween Jiann, Jingyao, Chong Cher, Zhang Ce, Hady, Andrew, Tuan Truong, Aghiles)

Congratulations, Maksim, on a job well done!

CIKM-2018 in Torino, Italy

In October, Andrew attended the 27th ACM International Conference on Information and Knowledge Management (CIKM 2018), held in Torino, Italy. Here, he shares his experience of the Torino trip.


CIKM 2018 was hosted at the Centro Congressi Lingotto, which is part of the old Fiat factory in the Lingotto area of Turin. On top of this building, you can still visit Fiat’s old rooftop test track, where Fiat’s cars were tested.

Centro Congressi Lingotto
(Conference Venue)
CIKM Opening Ceremony

CIKM 2018, with the theme, “From Big Data and Big Information to Big Knowledge”, brought together leading researchers and developers from the knowledge management, information retrieval, and database communities. This year, CIKM featured 147 full papers, 96 short papers, 33 industry papers, and 26 demonstrations in both oral presentation and poster forms.


The conference ran from 22 October to 26 October 2018, starting with many activities (workshops, AnalytiCup, CIKM Connect), followed by research paper presentations on the main conference days, and posters and tutorials on the last day. This year, CIKM welcomed more than 800 participants from all over the world. Among the highlights of CIKM 2018 were the inspiring and insightful keynote speeches from three amazing invited speakers, addressing new research ideas as well as challenges in building real-world AI applications. The takeaway message for me from the three keynotes is that one should focus on solving hard, impactful research problems to advance progress in AI.

Maarten de Rijke on “Shifting Information Interactions”
Edward Grefenstette on “Teaching Artificial Agents to Understand Language by Modelling Reward”
Yoelle Maarek on “Alexa and Her Shopping Journey”

Our paper, titled “Multiperspective Graph-Theoretic Similarity Measure”, describing a graph-based framework for modeling multiple similarity perspectives in the data, was accepted as an oral presentation at CIKM 2018. The presentation was well received by the audience and was followed by a constructive Q&A session. In particular, there was one very interesting comment on the possible use of graph embedding for the problem of multiple-perspective similarity learning, which I would like to consider for future work.

Andrew is presenting our paper on
“Multiperspective Graph-Theoretic Similarity Measure”

The conference participants were also treated to appealing and delicious Italian food, not to mention free-flowing wine. It was also the time for people to network with each other and congratulate all the award winners (including Test Of Time Award, CIKM Analytic Competition, Best Full Paper, Best Short Paper, and Best Demo).


CIKM 2018 was also an opportunity to discover Turin, the first capital of Italy. The city is famous for its many museums and churches with amazing architecture. I first visited the Egyptian Museum of Turin, which is known as the best and largest museum outside Egypt dedicated solely to Egyptian art and culture. This is a must-visit place if you go to Turin.

Appealing and delicious food in CIKM18 banquet

The second place I visited was the Mole Antonelliana, a major landmark building in Turin. It also houses the National Museum of Cinema, which presents pre-cinematographic optical devices such as magic lanterns, earlier and current film technologies, and stage items from early Italian movies.


Turin is also renowned for its car manufacturing industry, with Fiat the most dominant name. That is why, on my last day in Turin, I decided to explore the National Automobile Museum, which currently owns a collection of almost 200 cars from eighty automobile brands representing eight countries. The museum also has its own library, documentation centre, bookshop, and auditorium. If you are a car lover, you should definitely visit this building.

An old car from the Automobile Museum
A Sphinx statue from the Egyptian Museum

Lastly, as a football fan, I could not help but visit Allianz Stadium, home of Juventus football club. I also visited the nearby Juventus museum to learn more about the rich tradition of this club. I got to see with my own eyes the Champions League cup and the Ballon d’Or. What a fantastic experience!

Andrew at the entrance of
the Juventus Museum
Mini Mole Antonelliana from
the National Museum of Cinema

Enrichment Course on Web Data Extraction and Regression Analysis

Fired with educational zeal, Preferred.AI conducted a 5-day (Oct 3-8 2018) enrichment course on Web Data Extraction and Regression Analysis, which was organized by the SMU School of Information Systems in conjunction with the impending launch of its BSc (Computer Science).

The instruction team, L to R: Max, Ween, Tuan, Jingyao, Hady, Andrew, Aghiles

We welcomed a group of 20 bright students from Singapore Polytechnic (SP), Ngee Ann Polytechnic (NP), and Temasek Polytechnic (TP).  They are part of the prestigious Industry Preparation for Pre-graduate (iPREP) Programme or Infocomm Polytechnic (iPoly) Scholarship run by IMDA.


Through the course, we inculcated an appreciation of the data science pipeline by requiring the students to go through the different stages of data collection, feature engineering, and building a prediction model using regression, culminating in an end-to-end project.


On Day 1, Ween anchored the data collection module of the course, which is based on Venom, a focused crawler framework for the deep web developed in-house by Preferred.AI and open-sourced for public use.

Ween ran with the data collection module of the course.
With the zest of a seasoned instructor, he sure wasn’t a “crawler”.

To help students internalize the lessons meaningfully, we had them work in small groups on a realistic project involving a specific Web site.  Working with each group was a coach from Preferred.AI.  That the groups managed to build working crawlers by the end of Day 2 spoke of their effective teamwork, the coaches’ able guidance, and Venom’s powerful features.

Aghiles (right)
coached Group 1
Max (2nd from left)
coached Group 2
Tuan (middle)
coached Group 3
Ween (middle)
coached Group 4
Jingyao (right)
coached Group 5
Andrew (standing, left)
coached Group 6

On Day 3, Max instructed the students on how to use machine learning techniques such as linear regression and logistic regression to build prediction models.  On Day 4, the project groups began training their models using the data they were collecting with the crawlers they’d built earlier.
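To give a flavour of the regression step, here is a minimal sketch of fitting a one-feature linear regression by ordinary least squares in plain Python. It is not the course material: the `fit_line` helper is hypothetical, and the car ages and prices are made-up numbers standing in for the kind of data a crawler might collect.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept (one feature)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope is the covariance of x and y divided by the variance of x.
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up training data: car age in years vs. price in thousands of dollars.
ages = [1, 2, 3, 4, 5]
prices = [90, 80, 70, 60, 50]

slope, intercept = fit_line(ages, prices)
predicted = slope * 6 + intercept  # predicted price of a 6-year-old car

print(slope, intercept, predicted)  # -10.0 100.0 40.0
```

Real crawled data is noisier and has many more features, so in practice one would reach for a library implementation, but the idea of fitting parameters that minimise the squared prediction error is the same.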

Max dished out lessons on regression and satiated the students’ hunger to learn
by using chicken rice as an example.

In true SIS learning style, on Day 5 the student groups took turns presenting, and defending, their projects.  The team camaraderie was palpable.  The coaches were as anxious as their proteges.  In just 5 short days, they bonded.  In the post hoc feedback, 90% of the students rated the coaches as ‘Helpful’ or even ‘Awesome’, without whom the learning experience just wouldn’t be the same.

Group 1 predicted ratings based on restaurant reviews
Group 2 assessed how to price properties for sale
Group 3 predicted the price ranges of food places
Group 4 built a model to price used cars effectively
Group 5 estimated rental prices of apartments
Group 6 categorized cars by types based on their features 

The projects were finally evaluated by Hady and a guest judge, Wu Huayu (VP, Data Science – DBS Bank).  Huayu had earlier kindly shared his expertise by giving a talk titled “Applications of Data Science in Industries”, a wide-ranging coverage of the history of big data and AI and how their applications touched various industries including banking, manufacturing, etc.


We awarded two prizes.  The Best Project was awarded to the group with the most creative, technically rigorous, impactful, and well-presented project. The Best Class Participation was awarded to the individual student who contributed the most to the class learning.

Presentation of the Best Project Prize, L to R: Hady (course coordinator), Group 4 (winner) comprising Jewelyn, Jeffery and Ronald from SP, Huayu (guest judge)
Presentation of Best Class Participation Prize, L to R: Hady (course coordinator), Max (instructor), Danial (winner) from SP, Ween (instructor)

Once the prizes were announced, the winners were finally revealed.  Yet there was no loser in sight.  After all, when learning takes place, we’re all winners.

UAI-2018 in Monterey

In early August 2018, Aghiles attended the international conference on Uncertainty in Artificial Intelligence (UAI), which took place in Monterey, California. Here, he shares his UAI experience.


The conference was hosted in the InterContinental Hotel, and according to the organizers, this was the biggest UAI ever. The papers presented in the conference covered various current areas in machine learning and AI. Among the topics most represented were: Representation Learning (where our contribution falls), Causal Inference, Variational Inference, Gaussian Processes, Online and Reinforcement Learning.

The InterContinental, main entrance side.
UAI’18 opening words

Each accepted paper was presented as an oral presentation, a poster, or both. For the oral presentations, there was only one session at a time, which was quite convenient as one could attend any talk of interest. I particularly enjoyed the daily poster sessions; they were well attended and allowed for deep and enriching discussions and exchanges.

Aghiles with our poster on Probabilistic Collaborative Representation Learning

Our work accepted to UAI is entitled “Probabilistic Collaborative Representation Learning for Personalized Item Recommendation” and describes a new Bayesian model for jointly modeling user preferences and learning deep item features from auxiliary information (such as items’ textual descriptions, images, contexts, etc.).


The conference would not have been a full experience without its banquet dinner at the beautiful Monterey Bay Aquarium. It was another opportunity to meet and interact with people in a broader sense. Two things stay with me from that evening. The first one, of course, is the excellent food :). The second one is a discussion with a group of researchers working on causal inference, which made me realize the importance of looking into this huge, underexplored field in Machine Learning.

Monterey Aquarium during the banquet

UAI’18 was also an opportunity to discover Monterey. Aside from a long and rich history, the things that I enjoyed most about Monterey were the cool-summer weather, fresh & delicious seafood, and all these defunct sardine-canning factories turned into bars, restaurants or shops.

Cannery Row, the site of several now-defunct sardine-canning factories
The Monterey Canning Company, now transformed into a shopping center
Monterey Harbor Area
Monterey Harbor Pier


IJCAI-18 in Stockholm

In July 2018, Hady traveled to Stockholm, Sweden for the International Joint Conference on Artificial Intelligence (IJCAI).  Here, he recounts his experience from the trip.


IJCAI-18 was probably the largest academic conference I have participated in so far, with 2500 registered attendees.  This was 20% larger than the 2017 conference.  This pronounced growth and outsized congregation is one more sign of the rise (the return?) of Artificial Intelligence or AI.

The conference was held in the massive convention centre Stockholmsmässan

The conference organizer knew how to put on a show.  The scene-stealer during the opening ceremony was definitely the dancing couple of human and robot, gyrating harmoniously to the catchy beat.  Talk about what AI can do!

In the opening ceremony, we were treated to a spectacle of human-robot duet dance.
The conference had 710 papers and a selective 21% acceptance rate. Singapore more than pulled its weight with 26 papers.

Our Preferred.AI group had two papers accepted to the conference.  Both were presented in the Learning Preferences or Rankings session.  The first paper, “Modeling Contemporaneous Basket Sequences with Twin Networks for Next-Item Recommendation” by Trong, Hady, and Yuan, explores the interaction between two behavioral streams that occur concurrently, such as clicking and purchasing on an e-commerce site, and how they can be modeled jointly to improve sequential recommendation.


The second paper “A Bayesian Latent Variable Model of User Preferences with Item Context” by Aghiles and Hady describes a novel graphical model based on Poisson factorization that incorporates item context information, such as which items are viewed together, in addition to  user-item interactions, to improve recommendations especially for users with more limited information.

DFN: Discordant Fraternal Network, a neural network for modeling the interaction between two sequence types of user actions
C2PF: Collaborative Context Poisson Factorization, a graphical model that incorporates item context for recommendation
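For readers unfamiliar with Poisson factorization, the building block that C2PF extends, here is a minimal sketch in plain Python. It shows only the basic idea (an interaction count is modeled as Poisson with a rate given by the dot product of user and item factors) and leaves out the item-context component entirely, so it is not the C2PF model itself. The helper names and all factor values are made up for illustration.

```python
import math


def poisson_rate(user_factors, item_factors):
    """Expected interaction count: dot product of non-negative factors."""
    return sum(t * b for t, b in zip(user_factors, item_factors))


def poisson_log_likelihood(count, rate):
    """Log-probability of observing `count` under a Poisson with this rate."""
    return count * math.log(rate) - rate - math.lgamma(count + 1)


# Hypothetical 2-dimensional latent factors (not learned from any data).
theta_user = [0.5, 1.5]    # user's affinity for each latent dimension
beta_item_a = [2.0, 0.2]   # item A loads mostly on dimension 1
beta_item_b = [0.1, 0.1]   # item B loads weakly on both dimensions

rate_a = poisson_rate(theta_user, beta_item_a)
rate_b = poisson_rate(theta_user, beta_item_b)

# A single observed interaction is better explained by the higher-rate item,
# so item A would be ranked above item B for this user.
ll_a = poisson_log_likelihood(1, rate_a)
ll_b = poisson_log_likelihood(1, rate_b)
print(rate_a, rate_b, ll_a > ll_b)
```

In a full model the factors are given priors and inferred from the observed user-item interactions rather than set by hand. As described above, the paper additionally brings in item context, such as which items are viewed together.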

In addition to the oral presentations in the session, we also got a chance to engage the audience in a poster session.  This allowed deep dives into specific issues, which would take more time and one-on-one discussions.  For instance, from a discussion with a researcher from a large online retailer, I learnt about how prevalent recurrent neural networks were in the company’s sequential recommendation models.

Hady with the two Preferred.AI posters on Contemporaneous Basket Sequences and Collaborative Context Poisson Factorization respectively

The two things that surprised me the most about Stockholm were the weather and the water.  While it was summer then, I was still struck by how a place so far north could be so warm.  No wonder the humidity is high, because water is everywhere!  I grew up with the notion that neighboring Indonesia was the largest archipelagic nation.  While that is probably still true in terms of area, little did I expect that the Stockholm archipelago actually has an even greater count of islands than Indonesia.

The view from the Stockholm City Hall, where one of the receptions was held
Ferry to Djurgården, heading to the social program at Skansen, the world’s oldest open-air museum

Prior to the trip, the thing I associated the most with Sweden was the Nobel Prize.  So a visit to Stockholm would not have been complete without experiencing the Nobel Museum.  While the museum has many artefacts connected to various prize winners over the years, the greatest find in my exploration was the cafe!  Some of the chairs have been signed by past prize winners. Well, I may not yet be able to say that I have stepped into their shoes, but now I could say that I once sat on their chairs :).

Nobel Museum, where I probably spent too much time in the museum shop 🙂
In the cafe, one of the chairs was suspended from the ceiling to highlight that the bottom of some chairs may have been signed by prize winners
Signatures of Barack Obama (Peace 2009) and Aung San Suu Kyi (Peace 1991) underneath one chair

KDD.SG Tutorial on Image Classification Using CNN (Materials)

Our appreciation to KDD.SG (Singapore Chapter of SIGKDD) and DSSG (DataScience Singapore) for jointly organizing and inviting us to deliver a public tutorial on June 9.  Tuan and Hady delivered a tutorial on image classification using convolutional neural networks, focusing on two applications, namely: face emotion recognition and visual sentiment analysis.

Hady opened the tutorial on Image Classification using CNN
Tuan explained the implementation of a multilayer perceptron in TensorFlow
The audience was actively involved and participating in the tutorial

For those who missed the tutorial, you may find the materials here for your own self-practice.  A video recording of the event can be found below.


SDSC – DSSG Data Science Meetup (Videos)

We are grateful to SDSC (Singapore Data Science Consortium) and DSSG (DataScience Singapore) for organizing the June 7 meetup, and to the hundred attendees who gave us an opportunity to share some of our recent work.

After opening remarks by Caroline from SDSC, Hady gave the first technical talk titled “Modeling Preferences from Multi-Modal Data: A Deep Learning Exploration”.  The talk covered several works by our group in modeling user preferences from several modalities such as social networks, images, as well as sequences.



Focusing on modeling text, in particular about word embeddings and how sentiment infusion could improve the performance of word embeddings on several text classification tasks, Maksim gave a talk titled “SentiVec: Sentiment-Infused Word Embeddings”.



Rounding out the coverage of modalities, Aghiles gave a talk titled “C2PF: A Poisson Latent Factor Model of User Preferences with Item Context” on incorporating the effects of item context to improve recommendations.



You may find the slides of the three talks here.  It was a fruitful session, and we got a chance to meet many new contacts from academia and industry.  Looking forward to the next opportunity to get in touch.