International and fun insights + visualisations using Google Trends data and Python
Any claim that is not backed by data is difficult to digest. In any vertical or industry, it’s easier to convince yourself, your stakeholders and your teammates when your arguments are backed by data.
~Apurav Chauhan 😋
History — Need of visualisations
Countless times as an engineering head, I have had to choose between one option and another. And in a dynamic industry, it is highly likely that a choice that seemed best a few years back has since been replaced by a newer one.
Around 2010, the de facto choice for building a search platform was the Lucene-powered Apache Solr. You went through the docs, read Stack Overflow and some comparisons, and concluded that yes, it was the right choice.
Now imagine you were to build a search platform in 2019. Would you go with the research you did in 2010? No! You would do your homework again and check what has changed and what tech the community across the world is using. A curious mind will also want the data behind this information, such as when the shift happened, to get deeper insight into the technology being chosen. Below is a static graph from Google Trends showing the usage of the above two in the US.
I have found that Google Trends answers such questions to an extent. Combine the data about what people are searching for across countries and you can predict many cool things.
What if we could combine data from different countries for comparison and get deeper insights to make wise decisions? All we need is a small utility that aggregates country-based data and builds visualisations for easy insights. With this idea in mind, and to be able to make wise decisions on simple comparisons, I made a Python utility that aggregates this data and converts it into an animated bar chart visualisation.
The code is written in Python, and the v1 utility expects the downloaded Google Trends data to be present on your system.
- Visit trends.google.com and enter the search terms you want to analyse.
- Change the country or time duration for your analysis.
- Download the CSV using the download button, marked in red in the screenshot below.
- Save the file under the country name, e.g. US.csv, and edit it to remove the first two junk lines.
After removing the top two junk lines, this is how your country files should look:
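If you would rather not edit each file by hand, the last step can be scripted. Below is a minimal sketch, not part of the utility itself: the function name `strip_leading_lines` is hypothetical, and it simply assumes the first two lines of the download are the junk lines described above.

```python
from pathlib import Path


def strip_leading_lines(csv_path, count=2):
    """Rewrite the file at csv_path in place without its first `count` lines.

    Assumption: the junk sits in exactly the first `count` lines of the
    download, as described in the steps above.
    """
    path = Path(csv_path)
    lines = path.read_text(encoding="utf-8").splitlines(keepends=True)
    # Keep everything after the junk lines, untouched.
    path.write_text("".join(lines[count:]), encoding="utf-8")
```

Calling `strip_leading_lines("US.csv")` once per country file should leave the header row as the first line. Run it only once per file, since every call drops two more lines.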
We have written a small utility that transforms and aggregates this data so that it can be turned into a fun visualisation. Currently the output data format follows the Highcharts JS library. If you want to integrate with another library, you may tweak the code.
To compare augmented reality and artificial intelligence across countries over different time quarters, the data read from India, US, China, Germany and Canada, in that order, is parsed on a time basis into this format:
To run the Python program, use the command below inside the downloaded repo (assuming python3 is installed on your system):
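To make the idea of time-based aggregation concrete, here is a rough sketch of how country files could be merged per time period. It is my illustration under stated assumptions, not the utility's actual code or its exact output shape: the names `aggregate_by_time` and `to_number` are hypothetical, column headers are assumed to look like `term: (Country)`, every file is assumed to cover the same time periods, and non-numeric cells such as `<1` are counted as 0.

```python
import csv
from collections import defaultdict
from pathlib import Path


def to_number(cell):
    """Convert a cell to int; non-numeric cells such as '<1' become 0 (assumption)."""
    cell = cell.strip()
    return int(cell) if cell.isdigit() else 0


def aggregate_by_time(data_dir, countries, time_column):
    """Return {period: {term: [one value per country, in `countries` order]}}.

    Assumes each country file is named <country>.csv and that all files
    cover the same periods; otherwise the value lists will not line up.
    """
    merged = defaultdict(lambda: defaultdict(list))
    for country in countries:
        csv_file = Path(data_dir) / f"{country}.csv"
        with open(csv_file, newline="", encoding="utf-8") as handle:
            for row in csv.DictReader(handle):
                period = row[time_column]
                for header, value in row.items():
                    if header == time_column:
                        continue
                    # Drop an assumed ': (Country)' suffix so terms match across files.
                    term = header.split(":")[0].strip()
                    merged[period][term].append(to_number(value))
    return {period: dict(terms) for period, terms in merged.items()}
```

Keeping the values in country order for each period means a chart library can draw one bar per country and step through the periods to animate them.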
It will prompt you to provide:
- data folder: where the CSV files in the above-mentioned format are kept
- date/time column: the default column containing the time/date segmentation
- output file name: name of the HTML file in which to create your visualisation
And you will see output like this:
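For readers curious how such prompts might be wired up, here is a small sketch. It is not taken from the utility: the function name `collect_settings` and the default values (`data`, `Week`, `output.html`) are placeholders I picked for illustration.

```python
def collect_settings(ask=input):
    """Ask for the three inputs, falling back to a default on empty input.

    `ask` defaults to the built-in input(); passing another callable makes
    the function easy to exercise without a terminal.
    """
    # Hypothetical defaults, chosen only for this example.
    defaults = {
        "data folder": "data",
        "date/time column": "Week",
        "output file name": "output.html",
    }
    settings = {}
    for label, default in defaults.items():
        answer = ask(f"{label} [{default}]: ").strip()
        settings[label] = answer or default
    return settings
```

Pressing Enter at a prompt keeps the placeholder default, and anything you type replaces it.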
One sample insight that stands out: when it comes to innovation in the software domain, the US was leading in the early years. China slowly picked up the pace to become the market leader. India, however, has always lagged behind in the race. You can see that a language first becomes obsolete in the US and China, and India catches up only years later. Today, when the world is innovating with Python, India is still primarily working with Java. This suggests that while the world has started innovating towards ML and AI, India might take more years to catch up with its counterparts.
Python utility to create beautiful and interactive visualisation for Google Search trends …
For any feedback, tweet or find me on LinkedIn: @apuravchauhan