GVATE

Learning Analytics (or Google Analytics)

We use Google Analytics to track data about our visitors and what they do on our sites. 

Google Analytics at scale 

Google Analytics will only return 500,000 rows of data for any query you send it (with a few exceptions). If you make a request whose result covers more than this amount of data, Google Analytics will take a sample of 500,000 rows and scale the results up to estimate the totals.

 

The Google Analytics web interface is a powerful way to explore your data, get a rough idea of the numbers, or quickly share a report with colleagues. For exact figures on large data sets, though, you need to break a request into smaller slices that each stay under the sampling limit, and it’s much easier to do this via the Core Reporting API. The Query Explorer is a relatively friendly way to get started constructing queries, and it’s straightforward to go from there to grabbing data using Python or another programming language.
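
One common way to keep each request under the sampling limit is to split a long date range into shorter slices and request each slice separately. The sketch below shows only the slicing step, assuming you slice by date; `date_slices` is a hypothetical helper, not part of any Google library.

```python
from datetime import date, timedelta


def date_slices(start, end, days_per_slice=1):
    """Yield (slice_start, slice_end) pairs covering start..end inclusive."""
    one_day = timedelta(days=1)
    current = start
    while current <= end:
        # The last slice is cut short so it never runs past `end`.
        slice_end = min(current + timedelta(days=days_per_slice) - one_day, end)
        yield current, slice_end
        current = slice_end + one_day


# Ten days in slices of four days: two full slices and one short one.
slices = list(date_slices(date(2024, 1, 1), date(2024, 1, 10), days_per_slice=4))
for slice_start, slice_end in slices:
    print(slice_start.isoformat(), "to", slice_end.isoformat())
```

If a slice still comes back sampled, a smaller `days_per_slice` (or a further split on another dimension) is the usual next step.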

 

Since you have to stitch these slices of data back together locally, you might as well store them in a central place where you can query and use them later. It also means that you can run complex queries like “show me the 100 posts with the most pageviews over the past month within a specific set of categories”, or “get the most popular post from each instructor that has published at least five posts.” It’s challenging to do this with Google Analytics alone since it doesn’t understand our concepts of “category” or “instructor.” Another advantage of having everything in a single database is that it’s a lot easier to get it into external analysis and visualization tools like Tableau.
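
As a rough illustration of the first example query, here is a sketch using SQLite from Python's standard library. The table layout, column names, and sample rows are all assumptions made up for this example; the real database schema is not described here.

```python
import sqlite3

# Hypothetical schema: one table of posts, one of daily pageviews per post.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE posts (id INTEGER PRIMARY KEY, title TEXT, category TEXT, instructor TEXT);
CREATE TABLE pageviews (post_id INTEGER, day TEXT, views INTEGER);
""")
conn.executemany("INSERT INTO posts VALUES (?, ?, ?, ?)", [
    (1, "Intro to Flexbox", "Web Design", "Ana"),
    (2, "Python Decorators", "Code", "Ben"),
    (3, "Grid Layouts", "Web Design", "Ana"),
    (4, "Logo Basics", "Design", "Cy"),
])
conn.executemany("INSERT INTO pageviews VALUES (?, ?, ?)", [
    (1, "2024-01-05", 120), (1, "2024-01-06", 80),
    (2, "2024-01-05", 300), (2, "2023-11-01", 5000),
    (3, "2024-01-05", 50), (4, "2024-01-05", 999),
])


def top_posts(conn, categories, since, limit=100):
    """Return (title, total_views) for the most-viewed posts in the given categories."""
    placeholders = ", ".join("?" for _ in categories)
    sql = f"""
        SELECT p.title, SUM(v.views) AS total
        FROM posts p JOIN pageviews v ON v.post_id = p.id
        WHERE p.category IN ({placeholders}) AND v.day >= ?
        GROUP BY p.id
        ORDER BY total DESC
        LIMIT ?
    """
    return conn.execute(sql, [*categories, since, limit]).fetchall()


top = top_posts(conn, ["Code", "Web Design"], since="2024-01-01")
print(top)
```

The category filter is the part Google Analytics cannot do on its own, because the category lives in the local `posts` table rather than in the analytics data.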

 

In Google Analytics, each reporting view has a limit of 10,000 API requests per day: that’s the maximum number of times I can ping it to get some data via the API. To get around this, we use a separate view for each section of the site. This way, we can query the Code view when we want data about Code posts, the Web Design view when we want data about Web Design posts, and so on, and make the best use of the overall 50,000 query limit instead of being restricted to the 10,000 of a single view.
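
A small sketch of routing each request to the view for its category while keeping a local count against the per-view limit. The view IDs are placeholders and `ViewRouter` is a hypothetical helper; the counter only tracks requests made through it and does not read the real quota from Google.

```python
from collections import Counter

PER_VIEW_DAILY_LIMIT = 10_000  # per-view figure quoted in the text

# Placeholder view IDs; real ones come from your Google Analytics account.
VIEW_IDS = {"Code": "view-code", "Web Design": "view-web-design"}


class ViewRouter:
    """Pick the view for a category and count requests made against it."""

    def __init__(self, view_ids, per_view_limit=PER_VIEW_DAILY_LIMIT):
        self.view_ids = view_ids
        self.per_view_limit = per_view_limit
        self.used = Counter()

    def view_for(self, category):
        view_id = self.view_ids[category]
        if self.used[view_id] >= self.per_view_limit:
            raise RuntimeError(f"daily limit reached for view {view_id}")
        self.used[view_id] += 1
        return view_id


router = ViewRouter(VIEW_IDS)
view_id = router.view_for("Code")
print(view_id, router.used[view_id])
```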

 

This massively reduced the number of requests I had to make overall, which sped things up and meant I wasn’t burning through so much of our Google Analytics API quota.

 

Google Analytics lets you submit up to ten API requests per second, so you can submit 50 requests within the five seconds it takes to process your first one. This means that big queues of requests can be processed up to 50 times faster than if you made the requests in serial, which turns a 24-hour wait into a lunch-break delay.
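
A minimal sketch of submitting requests in parallel while staying under a per-second rate, using only the standard library. `RateLimiter` and `run_all` are hypothetical helpers, and `fake_fetch` is a stand-in for whatever function actually calls the API.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


class RateLimiter:
    """Space out calls so that at most `rate` of them start per second."""

    def __init__(self, rate):
        self.interval = 1.0 / rate
        self.lock = threading.Lock()
        self.next_slot = time.monotonic()

    def wait(self):
        # Reserve the next free start time under the lock, then sleep outside it.
        with self.lock:
            now = time.monotonic()
            slot = max(self.next_slot, now)
            self.next_slot = slot + self.interval
        delay = slot - now
        if delay > 0:
            time.sleep(delay)


def run_all(requests, fetch, rate=10, workers=50):
    """Run fetch(request) for every request in parallel, in rate-limited order."""
    limiter = RateLimiter(rate)

    def task(request):
        limiter.wait()
        return fetch(request)

    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(task, requests))  # results keep the input order


def fake_fetch(request):
    # Stand-in for a real API call.
    return request * 2


results = run_all(range(5), fake_fetch, rate=10, workers=5)
print(results)
```

With 50 workers and a five-second response time, all 50 workers can be waiting on responses at once, which is where the speed-up over serial requests comes from.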

 

If (or rather, when) your code crashes, your internet connection drops, you use up all your Google Analytics API allowance, or something else goes wrong 90% of the way through downloading a large set of data, you do not want to have to restart the download process. Make sure you’re saving the data to disk as you download it (rather than just storing it in memory), and make sure you can skip to wherever you left off when you start the download again.
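
One simple way to get both properties is to append each finished slice to a file as a single JSON line, and to read that file on startup to see which slices can be skipped. This is a sketch under that assumption: `download_all` and `flaky_fetch` are hypothetical, and it does not try to repair a line that was only half written when the process died.

```python
import json
import os
import tempfile


def download_all(slices, fetch, path):
    """Fetch each slice and append it to `path`, skipping slices already saved."""
    done = set()
    if os.path.exists(path):
        with open(path, encoding="utf-8") as f:
            for line in f:
                if line.strip():
                    done.add(json.loads(line)["slice"])
    with open(path, "a", encoding="utf-8") as f:
        for key in slices:
            if key in done:
                continue  # saved by an earlier run
            rows = fetch(key)
            f.write(json.dumps({"slice": key, "rows": rows}) + "\n")
            f.flush()
            os.fsync(f.fileno())  # make sure the slice is on disk before moving on


calls = []


def flaky_fetch(key):
    # Stand-in for a real API call that fails once on the third day.
    calls.append(key)
    if key == "2024-01-03" and calls.count(key) == 1:
        raise ConnectionError("simulated dropped connection")
    return [{"day": key}]


path = os.path.join(tempfile.mkdtemp(), "download.jsonl")
days = ["2024-01-01", "2024-01-02", "2024-01-03"]
try:
    download_all(days, flaky_fetch, path)
except ConnectionError:
    pass  # the first run dies part-way through
download_all(days, flaky_fetch, path)  # the second run picks up where it left off
```

The second run only requests the day that failed; the two days already on disk are not fetched again.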

 

Keep the instructions that describe what to download separate from the code that deals with the Google Analytics API and the way that our database is structured. It means that you can create new sets of instructions without touching the underlying code that executes them, and you can improve the underlying code without worrying about messing up any of the regular data downloads.
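
A sketch of what that separation might look like: the instructions are plain data, and one small function executes any of them. The `Job` fields, job names, and table names are assumptions for illustration; `fake_fetch` and `fake_store` stand in for the real API and database code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Job:
    """A set of instructions: what to download and where to put it."""
    name: str
    view: str
    metrics: tuple
    dimensions: tuple
    table: str


# Adding a new regular download means adding an entry here, nothing else.
JOBS = [
    Job("code_pageviews_by_day", "Code", ("ga:pageviews",), ("ga:date", "ga:pagePath"), "code_pageviews"),
    Job("web_design_pageviews_by_day", "Web Design", ("ga:pageviews",), ("ga:date", "ga:pagePath"), "web_design_pageviews"),
]


def run_job(job, fetch, store):
    """Execute one job; the API and database details live in fetch and store."""
    rows = fetch(view=job.view, metrics=job.metrics, dimensions=job.dimensions)
    store(job.table, rows)
    return len(rows)


def fake_fetch(view, metrics, dimensions):
    # Stand-in for the code that talks to the API.
    return [{"view": view, "metrics": list(metrics), "dimensions": list(dimensions)}]


tables = {}


def fake_store(table, rows):
    # Stand-in for the code that writes to the database.
    tables.setdefault(table, []).extend(rows)


counts = {job.name: run_job(job, fake_fetch, fake_store) for job in JOBS}
print(counts)
```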
