Visualizing Movie Recommendations using Plotly and TigerGraph 🐯
Note: This will be a brief overview of the major components required to put together the Plotly Dashboard for TigerGraph’s Movie Recommendation Starter Kit. For a more comprehensive overview (including the code needed to run the dashboard), make sure to check out this Google Colab Notebook. For a visual demonstration, make sure to check out this YouTube Tutorial!
TigerGraph’s Movie Recommendation Starter Kit is rich with data. With more than 27,000 movies, 138,000 reviewers, and over 2 million ratings, this Graph holds some truly fascinating conclusions for us to extract. However, being able to visualize this vast database is just as important as the information it contains.
In order to gain insights, traverse our Graph, and better understand its data, we must build an interactive dashboard.
Now, how will we accomplish this? First, we must assemble our toolkit…
- TigerGraph Cloud Portal: where our Graph will be hosted. This will contain our solution, data, and the queries used to traverse our Graph.
- Plotly Express: a high-level Python API which we’ll use to create figures
- Plotly Dash: used to create our dashboard containing our Plotly figures
Equipped with these three, we can dive in 😄!
Chapter 01 — Proper Prerequisites
First, we must create our TigerGraph solution. In order to do this, we can:
- Navigate to TigerGraph’s Cloud Portal
- Click on the blue “Create Solution” button
- Navigate to the Recommendations tab
- Select “Movie Recommendation Engine”
And now, we enter the details for our solution.
TG_HOST = "https://movie-plotly.i.tgcloud.io" # GraphStudio link
TG_USERNAME = "tigergraph" # This should remain the same...
TG_PASSWORD = "movieplot" # Shh, it's our password!
TG_GRAPHNAME = "MoviePlotly" # The name of the graph
Make sure that your subdomain name is unique! (two solutions cannot have the same subdomain at the same time, meaning yours will be different!)
Once this has been completed, click “Next” and then “Submit”. And voila, in a few seconds, our solution should go from “Uninitialized” to “Ready”.
Next, we must load our data via GraphStudio. To do this, we need to:
- Open GraphStudio by selecting it from the Applications tab
- Delete myGraph and create a new one titled “MoviePlotly”
Now, we can add our CSV files which will be used to populate our graph
- Select “Add Data File” and click on movies.csv
- Select “Add Data File” and click on ratings.csv
Before loading our data, we must first create a data mapping.
- Press “Map Data to Graph” and match each column to its attribute
- Click on “Publish Data Mapping” to save our changes
And finally, we can load our data!
- Navigate to the “Load Data” tab and click the play button
- Wait until both CSV files say finished
- Check out our graph in the “Explore Graph” tab
Congrats! Our solution is now fully loaded and ready to be used. To establish a connection between our solution and our Python script, we can utilize the pyTigerGraph package. To begin, import pyTigerGraph.
!pip install -q pyTigerGraph
import pyTigerGraph as tg
After this, we can run the following lines to establish a connection.
conn = tg.TigerGraphConnection(host=TG_HOST, username=TG_USERNAME, password=TG_PASSWORD, graphname=TG_GRAPHNAME)
conn.apiToken = conn.getToken(conn.createSecret())
print(conn.gsql('''ls''', options=[]))
Make sure to replace the solution credentials with your own information!
And voila, we’re good to go!
Chapter 02 — Quant Queries
Now we can proceed with the data visualization!
First, we need to determine what information would be most beneficial to display on the dashboard. In order to retrieve this information from our Graph, we will write several queries in GSQL (TigerGraph’s Modern Graph Query Language). For more information, check out this great resource!
This dashboard utilizes nine queries in total, all of which can be found in this Colab Notebook. For the purposes of this blog post, we will select and dissect two of them in order to better understand the GSQL at work.
Query 01 — Frequency map (genres → counts)
This query returns a frequency map of rated genres vs. counts of movies rated in each for a given individual. The complete query is displayed below.
Let’s walk through it!
First, we create a MapAccum<STRING, INT> which we’ll use to store each genre as well as its count. Then, for each rated movie by the inputted individual, we simply add an entry and return the final MapAccum.
Entry addition done here: (ACCUM @@allGenres += (tgt.genres->1);)
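To make the accumulator logic concrete, here is a minimal plain-Python sketch of the same idea. It is not the GSQL query itself: the `RATED` sample data and the `genre_counts` helper are made up for illustration, and a `Counter` stands in for the MapAccum<STRING, INT>.

```python
from collections import Counter

# Hypothetical sample data: (person_id, movie_id, genres) triples standing in
# for the "person rated movie" edges of the graph.
RATED = [
    ("1", "m1", "Comedy"),
    ("1", "m2", "Drama"),
    ("1", "m3", "Comedy"),
    ("2", "m1", "Comedy"),
]


def genre_counts(person_id, rated=RATED):
    """Count rated movies per genre string for one person."""
    counts = Counter()
    for pid, _movie, genres in rated:
        if pid == person_id:
            # Same effect as adding (genres -> 1) to a MapAccum<STRING, INT>:
            # values that share a key are summed.
            counts[genres] += 1
    return dict(counts)


print(genre_counts("1"))
```

Each rated movie contributes one entry, and entries that share a key are summed, which is exactly what makes the final map a frequency map.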
Pretty straightforward! Let’s take a look at a more complicated one…
Query 02 — Similar Reviewers
This query returns a frequency map of similar reviewers. The map’s keys are person ID’s and its values are the number of movies in common.
For each movie that the inputted person has reviewed, if another individual has given that movie the same rating, the similarity count of the other individual is increased by one. It essentially counts the number of movies that the two individuals have given the same rating to.
First, we create a MapAccum<VERTEX<movie>, DOUBLE> which we’ll use to store all movies rated by the inputted individual as well as the score of each. After this, we’ll create a MapAccum<VERTEX<person>, INT>.
For each movie rated by the inputted individual, we will find all other reviewers of that movie. If the rating given by that individual is the same, then we can update the count of that individual by one, meaning that individual has one more movie in common with the inputted person.
Entry addition is done here: ACCUM IF abs((@@allRatings.get(m)) - reverse_rate.rating) < 0.001 THEN @@allReviewers += (p->1)
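The same counting logic can be sketched in plain Python. This is only an illustration under stated assumptions: the `RATINGS` dictionary and the reviewer names are invented, and skipping the inputted person is my own choice rather than something confirmed by the original query.

```python
from collections import Counter

# Hypothetical ratings: person -> {movie: rating}
RATINGS = {
    "alice": {"m1": 4.0, "m2": 3.5, "m3": 5.0},
    "bob": {"m1": 4.0, "m2": 2.0, "m3": 5.0},
    "carol": {"m2": 3.5},
    "dave": {"m4": 1.0},
}


def similar_reviewers(person, ratings=RATINGS, tol=0.001):
    """Map each other reviewer to the number of movies rated identically."""
    my_ratings = ratings[person]  # plays the role of @@allRatings
    matches = Counter()           # plays the role of @@allReviewers
    for other, their_ratings in ratings.items():
        if other == person:
            continue  # assumption: the inputted person is not compared to itself
        for movie, score in their_ratings.items():
            # Ratings are floating point, so compare within a small tolerance.
            if movie in my_ratings and abs(my_ratings[movie] - score) < tol:
                matches[other] += 1
    return dict(matches)


print(similar_reviewers("alice"))
```

The tolerance check mirrors the `< 0.001` comparison in the GSQL snippet, which avoids testing two DOUBLE values for exact equality.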
Not too bad 😎!
With our queries installed and ready to go, we can begin to use them in order to retrieve, process, and display the Graph’s data in useful visualizations.
Chapter 03 — Functioning Figures
First, we must import the packages needed for the Plotly Dashboard.
!pip install -q jupyter-dash
!pip install -q dash-bootstrap-components
!pip install -q dash_daq
!pip install dash-extensions
from jupyter_dash import JupyterDash
import dash
import dash_table
import dash_core_components as dcc
import dash_html_components as html
from dash.dependencies import Input, Output, State
import dash_bootstrap_components as dbc
import plotly.express as px
import plotly.graph_objects as go
import pandas as pd
Alright, we have all the libraries we need!
Now, we still need to create visualizations using the queries we’ve written above. The following functions (detailed further in the Colab Notebook) provide a link between the GSQL queries and the Dashboard by polishing the raw output of our queries. They turn our Maps into useful visuals.
Instead of tediously working our way through each line of code, we’ll be covering a few of the commonly-used figures. All code can be found here!
Bar charts are used extensively throughout dashboards! Here’s how we’ll create ours to visualize how the rating of a movie has changed over time.
First, we run one of our person-specific queries.
Towards the bottom half of our codeblock, we use the map outputted by the query to create two lists, one that stores all the years and another that stores the average rating of that movie in the given year.
Using these two lists, we can create a bar chart.
With Plotly, it’s as simple as that! As seen above, we’ve created two separate traces, one that displays bars and another that also displays a trend line.
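For readers without the notebook open, here is a rough sketch of those two steps. It assumes the query returns a map of year to average rating (the sample values are invented) and builds the figure as a plain dictionary, which can then be handed to `go.Figure(...)` or to the `figure` property of `dcc.Graph`. The function names are hypothetical, not the notebook's.

```python
def rating_map_to_lists(avg_by_year):
    """Split a {year: average_rating} map into two parallel lists sorted by year."""
    years = sorted(avg_by_year, key=int)
    ratings = [round(float(avg_by_year[year]), 2) for year in years]
    return years, ratings


def rating_figure(years, ratings):
    """Describe a bar trace plus a line trace over the same data."""
    return {
        "data": [
            {"type": "bar", "x": years, "y": ratings, "name": "Average rating"},
            {"type": "scatter", "mode": "lines", "x": years, "y": ratings, "name": "Trend"},
        ],
        "layout": {
            "title": {"text": "Average rating by year"},
            "xaxis": {"title": {"text": "Year"}},
            "yaxis": {"title": {"text": "Average rating"}},
        },
    }


# Hypothetical query output: JSON map keys arrive as strings.
sample = {"2001": 3.5, "1999": 4.0, "2000": 3.75}
years, ratings = rating_map_to_lists(sample)
figure = rating_figure(years, ratings)
print(years, ratings)
```

Keeping the data processing separate from the figure description makes it easy to swap the bar trace for another chart type later.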
Here’s the result!
Another equally important chart is the pie chart!
In order to visualize which genres are included in our Graph, we’ll use a pie chart to display the number of movies each genre contains.
Once again, we begin by running one of our queries.
Next, we process the data received by filtering any incorrect entries and creating two lists, one to store all genres and another to store the number of movies that belong to that genre.
And finally, we use these two lists to create our pie chart.
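A minimal sketch of the filtering and chart-building steps might look like the following. What counts as an "incorrect entry" is an assumption here (empty genre names, a "(no genres listed)" placeholder, and non-positive counts), and the helper names and sample map are invented for illustration.

```python
def genre_map_to_lists(genre_counts, skip=("", "(no genres listed)")):
    """Drop unusable entries and return parallel genre/count lists, largest first."""
    genres, counts = [], []
    ordered = sorted(genre_counts.items(), key=lambda item: item[1], reverse=True)
    for genre, count in ordered:
        if genre in skip or count <= 0:
            continue  # assumption: these are the entries worth filtering out
        genres.append(genre)
        counts.append(count)
    return genres, counts


def genre_pie(genres, counts):
    """Describe a pie chart whose hover text shows label, count, and percentage."""
    return {
        "data": [
            {
                "type": "pie",
                "labels": genres,
                "values": counts,
                "hoverinfo": "label+value+percent",
            }
        ],
        "layout": {"title": {"text": "Movies per genre"}},
    }


# Hypothetical query output.
sample = {"Comedy": 3, "Drama": 5, "": 2, "(no genres listed)": 1}
genres, counts = genre_map_to_lists(sample)
pie = genre_pie(genres, counts)
print(genres, counts)
```

As with the bar chart, the resulting dictionary can be passed to `go.Figure(...)` or used directly as the `figure` of a `dcc.Graph`.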
Here’s our colorful creation!
By hovering over each slice, we can see the number of movies in that genre in addition to the percentage of movies represented. Pretty neat 😊!
A more obscure but interesting figure is the Radar Chart. This type of figure allows for the visualization of multiple variables on different axes.
In our case, we will be using a radar chart to compare multiple individuals’ most commonly-viewed genres. For more info, check out this radar guide!
First up, we have a small helper function that returns a list of genres and the number of movies viewed in each of those genres for a given individual.
This helper function will be called for each of the most similar users for a given individual, as seen in the main function below.
Calling the “Person_SimilarReviewers” query allows us to create a map of the top five similar reviewers to a given individual. Their ID numbers as well as the number of movies they have given the same rating to are stored.
After this, we create our radar chart and add five separate traces, one for each of the top five similar reviewers. The “hovertemplate” attribute of each trace is simply the text which is displayed upon hovering over a given trace.
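Here is one way the trace-building loop could look, sketched with plain dictionaries. The inputs are assumptions: `shared_counts` stands for the output of the similar-reviewers query, `genre_profiles` for the output of the helper function, and the IDs and numbers are invented.

```python
def radar_figure(genre_profiles, shared_counts):
    """Build one scatterpolar trace for each of the top five similar reviewers.

    genre_profiles: {person_id: (genres, counts)}
    shared_counts:  {person_id: number of movies given the same rating}
    """
    top_five = sorted(shared_counts, key=shared_counts.get, reverse=True)[:5]
    traces = []
    for pid in top_five:
        genres, counts = genre_profiles[pid]
        traces.append(
            {
                "type": "scatterpolar",
                "r": counts,        # distance from the center on each axis
                "theta": genres,    # one axis per genre
                "fill": "toself",
                "name": f"Person {pid}",
                # Text shown on hover; <extra></extra> hides the secondary box.
                "hovertemplate": (
                    f"Person {pid}<br>{shared_counts[pid]} movies rated the same"
                    "<extra></extra>"
                ),
            }
        )
    return {"data": traces, "layout": {"polar": {"radialaxis": {"visible": True}}}}


# Hypothetical inputs: six candidate reviewers, all profiled on the same genres.
shared = {"7": 12, "3": 30, "9": 5, "4": 21, "2": 8, "5": 1}
profiles = {pid: (["Action", "Comedy", "Drama"], [3, 2, 1]) for pid in shared}
radar = radar_figure(profiles, shared)
print([trace["name"] for trace in radar["data"]])
```

Sorting by the shared-rating count before slicing keeps only the five closest reviewers, so the chart stays readable.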
All put together, our radar chart is quite unique! Here’s the output when run with Person #1 as its inputted individual.
By hovering over each trace, we can learn more about similar reviewers.
Additionally, we can quickly see that these five individuals all enjoy Fantasy, Drama, Comedy, Adventure, Action, Thriller, and Sci-Fi 😄.
In addition to these charts, other visualizations that we’ll use include box and whisker plots, tables, and badges to display important numbers.
Combining these functions with our GSQL queries from above, we can begin to stitch together the layout and features of our final Dashboard!
Chapter 04 — Dazzling Dashboard
And at long last, with one stroke of our fingers, we’ll initialize our app.
app = JupyterDash(__name__, external_stylesheets=[dbc.themes.BOOTSTRAP, FONT_AWESOME], suppress_callback_exceptions=True)
Great and all… but there’s nothing really going on as of yet 😅.
We must now create the HTML needed for each of our app’s main pages: a sidebar, a general page, a movie-specific page, and a person-specific page.
Once again, all of the code needed to create the dashboard layout can be found in this Colab Notebook. Instead of tediously working through line-by-line, we’ll focus just on a few key components:
Component 1 — Sidebar & Callbacks
First up, we have our sidebar!
This sidebar will be displayed on all pages and allow users to easily navigate through our dashboard. Let’s dive into each of its child components!
This first half contains the title, three gauges, and the navbar. The three gauges display the number of vertices and edges that belong to the graph. The navbar is used to navigate through the three pages of the dashboard.
In order to switch among the general, movie-specific, and person-specific pages, we must utilize a callback. Our callback function is shown below.
As shown, depending on the pathname, different pages’ contents are returned and displayed. If an incorrect path is entered, an error is returned!
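The routing logic boils down to a lookup on the pathname. The sketch below keeps it as a plain function so it runs on its own; the paths, the placeholder page contents, and the component IDs mentioned in the comment are all hypothetical.

```python
# Hypothetical paths and placeholder contents; the real dashboard returns
# Dash components for each page instead of strings.
PAGES = {
    "/": "general page",
    "/movie": "movie-specific page",
    "/person": "person-specific page",
}


def render_page_content(pathname):
    """Return the content for a known path, or an error message otherwise."""
    # In a Dash app this function would typically be registered with something like
    # @app.callback(Output("page-content", "children"), [Input("url", "pathname")])
    # where "page-content" and "url" are assumed component IDs.
    if pathname in PAGES:
        return PAGES[pathname]
    return f"404: the pathname {pathname} was not recognised"


print(render_page_content("/movie"))
print(render_page_content("/oops"))
```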
Continuing on with our sidebar, the second half contains a brief description, several external links, and finally a TigerGraph logo and link.
Pretty straightforward! Let’s see how it turned out…
Sidebar, check ✅.
Component 2 — Search Bar & Callbacks
Next up, our search bar.
For both our movie-specific and person-specific pages, the user should be able to enter the ID number of either a movie or a person. In order to facilitate this input, we need to create a search bar and the appropriate callbacks.
As shown above, our search bar is simply a Dash card that contains a title, a dcc.Input component, a submit button, and the Plotly Dash logo.
The rest of the code simply helps with formatting, arranging each of the four core components in terms of spacing, layout, margins, color, etc.
In order to use the input entered by the user, we must write a callback.
The output of this callback is the content of the movie-specific page, the input is the button of the search bar (whether it’s been pressed or not), and the state used is whatever has been entered into the dcc.Input component.
If the submit button has been pressed, send the value to our function which renders the content of the movie-specific page.
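In sketch form, the body of such a callback might look like this. The function name, the placeholder message, the default `render` helper, and the component IDs in the comment are assumptions made for illustration; the notebook's version renders the full movie page.

```python
PROMPT = "Enter a movie ID and press submit."


def update_movie_page(n_clicks, value, render=lambda movie_id: f"movie page for {movie_id}"):
    """Render the movie page once the button is pressed with a non-empty ID."""
    # In a Dash app this would typically be registered with something like
    # @app.callback(Output("movie-content", "children"),
    #               [Input("submit-button", "n_clicks")],
    #               [State("movie-input", "value")])
    # where all three component IDs are assumed names.
    if not n_clicks or value is None or str(value).strip() == "":
        return PROMPT  # button not pressed yet, or nothing typed
    return render(str(value).strip())


print(update_movie_page(None, "42"))
print(update_movie_page(1, " 42 "))
```

Because the input value arrives as State rather than Input, typing alone never triggers the callback; only a button press does.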
Here are a few snippets of our search bar in action!
Component 3 — The Person-Specific Page
You may be asking yourself,
Woah! Where did that absolutely dazzling page come from in the GIF above?
Well, that’s where our final component comes into play: each of the three pages themselves! In this section, we’ll break down the layout of the Person-Specific page. However, the same principles apply for both the General and the Movie-Specific pages. Their source code can be found in this Notebook.
Here’s the first half of what Dash_GetPersonPage() returns.
Similar to the sidebar, the Person-Specific page contains several different elements. First, we have the search bar from before. Beneath that, we have a row containing the title and several metrics including the number of reviews, the number of ratings, and the accuracy of the inputted individual. After this, we have two columns: the least liked and most liked movies of that person.
Underneath the two columns, we have our radar chart and histogram from before. Each of these two figures has a title and description above them.
All put together, our Person-Specific page looks quite stunning 🤩.
With all of our components and callbacks in place, we can finally awaken our slumbering giant. With just one incantation, it rises!
And here, in all of its glory, is our Movie Starter Kit Dashboard 😍.
We have the main page.
We have the movie-specific page.
And finally, we have the person-specific page.
Pretty impressive visualization! All put together in Python 😊.
Congratulations on creating a dashboard for your TigerGraph solution 🥳!
Here are a few cool ideas and extensions you can try implementing:
- Add a query to map movie titles to their ID numbers. Use this to search for movies by their titles (you can try adding autocomplete as well!)
- Create a genre-specific tab. It can include information like the number of movies, average rating, top movies, etc.
- Create a date-specific tab. It can add statistics including the top movies, genres, and reviewers of the selected year/month
For more Plotly + TigerGraph, here are a few visualizations to check out!
For any help or assistance, make sure to check out the following resources:
With Plotly + TigerGraph, there’s truly an infinite number of Graphs and visualizations that can be made…
Happy exploring 🎉🎉🎉!
P.S. Feel free to submit your own demos via TigerGraph’s Community Contribution Program and earn your stripes!