00:01
Alright, so for this problem, we are going to start off with making sure that we have pandas imported.
00:06
So the standard there is to import pandas as pd just because it makes it a lot more convenient to reference pd rather than pandas every time.
00:15
Make sure that you have the pandas library installed, of course.
00:18
First thing that we're going to do after that is to read the data into a data frame.
00:25
Also, I'm going to make sure that I comment all of this as I go.
00:28
It's a good idea to comment in your own code as well.
00:31
One second here.
00:32
Okay, so we read the data into a data frame.
00:35
The way we'll do that is I'll create a data frame called df and assign it the result of pd.read_csv.
00:43
For this dataset, the file is vgsales.csv.
00:47
One thing I'll note is that I'm clearly working in a Jupyter notebook here. You don't necessarily need to do that; it's just the usual standard for data science work.
00:58
But of course you can do this in any Python environment; just make sure that you put in the appropriate path to the file.
01:04
Here I have my Jupyter notebook saved in the same folder as the file, so I don't need to worry about the path to it.
01:13
So if I execute the code now and then evaluate df, we can see that we've loaded the data frame as we wanted.
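The loading step described so far might look roughly like the sketch below. So that it runs on its own, it reads a few made-up rows from an in-memory string rather than the real vgsales.csv. The column names are an assumption based on the columns read out in the video, and Global_Sales in particular is a guess at what the sales column is called.

```python
import io

import pandas as pd

# Made-up rows standing in for vgsales.csv so the sketch runs on its own.
# The column names are an assumption; Global_Sales in particular is a guess
# at the name of the sales column.
sample_csv = io.StringIO(
    "Rank,Name,Platform,Year,Genre,Publisher,Global_Sales\n"
    "1,Game A,Wii,2006,Sports,Publisher X,10.0\n"
    "2,Game B,NES,1985,Platform,Publisher X,5.0\n"
    "3,Game C,Wii,2008,Racing,Publisher Y,4.0\n"
    "4,Game D,GB,1996,Puzzle,Publisher Z,2.0\n"
)

# Read the data into a data frame.
# With the real file next to the notebook: df = pd.read_csv("vgsales.csv")
df = pd.read_csv(sample_csv)

# Display the first few rows to check that it loaded as expected.
print(df.head())
```

If the file lives somewhere else, the string passed to pd.read_csv would need to be the relative or absolute path to it.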
01:22
We can see that we have rank, name, platform, year, genre, publisher.
01:26
We want to find the sum of the global sales for each one of the platforms.
01:31
Before the first step, I'll note that there are many different ways to solve this problem; this is just the approach that I've come up with here.
01:39
The first thing that I'm going to do is create a list of the individual platforms.
01:44
The way we'll do that is to create a variable called platforms, populated with the unique values from the platform column of our data frame.
01:56
So we do df["Platform"].unique().
02:03
If we do that and then display the platforms variable, we can see we have Wii, NES, GB, and so on.
02:10
So we have our array of the different platforms.
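As a rough illustration of that step, the sketch below builds a tiny made-up data frame and pulls out the unique platforms. The column name Platform (capitalized) and the sample values are assumptions for the example, not the real dataset.

```python
import pandas as pd

# Tiny made-up data frame; the real one would come from pd.read_csv.
# The column names here are assumptions for the sake of the example.
df = pd.DataFrame(
    {
        "Platform": ["Wii", "NES", "Wii", "GB"],
        "Global_Sales": [10.0, 5.0, 4.0, 2.0],
    }
)

# Collect each platform once, in order of first appearance.
platforms = df["Platform"].unique()

print(platforms)
```

Each platform appears once in the result, even though Wii shows up twice in the column.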
02:13
Once we have done that, think back to the data frame itself: we have all of the platforms mixed together.
02:23
What we want is to sum over each one of the platforms separately.
02:30
So what I have done is create a dictionary for the individual platforms.
02:39
So I'll say platforms_data, and I'll start it off as an empty dictionary and populate it with a for loop...
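The transcript cuts off before the loop body is shown, so the sketch below is only one plausible way to fill the dictionary, not necessarily what was typed in the video. It assumes the dictionary is named platforms_data, that it maps each platform to that platform's rows, and that the sales column is called Global_Sales. The data is made up. A groupby version is included as a cross-check, since it is another common way to get the same totals.

```python
import pandas as pd

# Made-up data; column names are assumptions for the sake of the example.
df = pd.DataFrame(
    {
        "Platform": ["Wii", "NES", "Wii", "GB"],
        "Global_Sales": [10.0, 5.0, 4.0, 2.0],
    }
)

# Unique platforms, as in the earlier step.
platforms = df["Platform"].unique()

# Start with an empty dictionary and fill it with a for loop:
# one entry per platform, holding only that platform's rows.
platforms_data = {}
for platform in platforms:
    platforms_data[platform] = df[df["Platform"] == platform]

# Sum the global sales within each platform's subset.
totals = {
    platform: float(subset["Global_Sales"].sum())
    for platform, subset in platforms_data.items()
}
print(totals)

# Alternative approach (not the one in the video): groupby gives the same sums.
grouped = df.groupby("Platform")["Global_Sales"].sum().to_dict()
print(grouped)
```

The dictionary approach keeps each platform's rows around for further inspection, while groupby goes straight to the totals.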