After a successful two-year partnership with Google, Twitter is extending the agreement to make further use of Google Cloud for its data analysis. Under the expanded deal, Twitter uses Google Cloud tools such as BigQuery, Dataflow, Bigtable, and machine learning services to analyze hundreds of petabytes of user data every day.
Twitter Uses Google Cloud For Data Processing
Twitter is a platform with millions of users posting hundreds of millions of tweets every day. This generates enormous dumps of data about every user's activity, which is of little use in its raw form.
But Twitter uses this data to improve its services and keep its platform as appealing as ever. Thus, it has to pick a partner that can handle and process this data effectively.
Twitter first partnered with Google Cloud in 2018 and has since used it to host its cold data storage and Hadoop clusters. It has now announced an extension of this agreement, which includes moving offline data to Google Cloud for processing and analysis using machine learning and other tools.
Twitter takes in hundreds of petabytes of data from millions of users every day, covering more than ten data points per user, such as likes, tweets, and retweets. Processing this volume is a hefty task that Google Cloud's tools can handle effectively: BigQuery to store and query data, Bigtable for NoSQL storage, Dataflow for streaming data analysis, and so on.
This automated processing and analysis reduces the workload on Twitter's in-house engineers and data scientists, who previously handled most of the work, such as extraction, transformation, loading, and modeling of data, through custom-built programs. A week earlier, Ford had also signed up to use Google Cloud for its data processing.