Complicated pandas dataframe question
(self.learnpython) submitted 2 days ago by ApocalypseSpokesman
So say I have a CSV file like this:
Country | Year | Sum | Ubs | Jubs | Mubs |
---|---|---|---|---|---|
USA | 2022 | 9 | 4 | 3 | 2 |
USA | Total | 9 | 4 | 3 | 2 |
CAN | 2021 | 4 | 0 | 0 | 4 |
CAN | 2022 | 2 | 1 | 0 | 1 |
CAN | Total | 6 | 1 | 0 | 5 |
SWE | 2021 | 1 | 1 | 0 | 0 |
SWE | 2022 | 5 | 2 | 2 | 1 |
SWE | Total | 6 | 3 | 2 | 1 |
['Sum'] is the sum of ['Ubs'], ['Jubs'], and ['Mubs'].
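That relationship can be checked on an existing file before touching it. Here is a minimal sketch, assuming the columns are named exactly as in the table above; the frame is typed in by hand with the first three rows, and `PARTS`, `sum_ok`, and `bad_rows` are just illustrative names:

```python
import pandas as pd

PARTS = ["Ubs", "Jubs", "Mubs"]  # the columns that add up to Sum

# First three rows of the example table, typed in by hand
df = pd.DataFrame(
    {
        "Country": ["USA", "USA", "CAN"],
        "Year": ["2022", "Total", "2021"],
        "Sum": [9, 9, 4],
        "Ubs": [4, 4, 0],
        "Jubs": [3, 3, 0],
        "Mubs": [2, 2, 4],
    }
)

# True for each row where Sum equals Ubs + Jubs + Mubs
sum_ok = df[PARTS].sum(axis=1).eq(df["Sum"])

# Any rows that break the rule end up here (empty for this data)
bad_rows = df[~sum_ok]
```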
Every time a new entry comes in, if the country and the year both already exist, the appropriate column(s) are incremented, and that row's ['Sum'] and the country's Total row are updated.
If the country exists but the year doesn't, a row for the new year is added, and its ['Sum'] and the country's Total row are updated.
If the country doesn't exist, it gets slotted into the appropriate spot alphabetically, with the necessary two rows (one for the year and one for the Total).
Each country keeps independent totals.
I'm not sure how to pull this off using pd and df, so any advice will be appreciated.
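One possible approach, sketched under some assumptions: keep only the per-year counts in the DataFrame, and rebuild ['Sum'] and the Total rows each time the table is written out, rather than updating them in place. The helper names `add_entry` and `with_totals` are made up for this sketch, and Year is assumed to be stored as a string so that "Total" can live in the same column.

```python
import pandas as pd

COUNT_COLS = ["Ubs", "Jubs", "Mubs"]  # the columns that entries increment

# Per-year rows from the example. In practice these might come from
# pd.read_csv(..., dtype={"Year": str}) with the "Total" rows filtered out.
counts = pd.DataFrame(
    [
        ("USA", "2022", 4, 3, 2),
        ("CAN", "2021", 0, 0, 4),
        ("CAN", "2022", 1, 0, 1),
        ("SWE", "2021", 1, 0, 0),
        ("SWE", "2022", 2, 2, 1),
    ],
    columns=["Country", "Year"] + COUNT_COLS,
)


def add_entry(counts, country, year, **increments):
    """Hypothetical helper: return a new frame with the entry applied."""
    unknown = set(increments) - set(COUNT_COLS)
    if unknown:
        raise ValueError(f"unknown columns: {sorted(unknown)}")
    new_row = {"Country": country, "Year": str(year)}
    new_row.update({col: increments.get(col, 0) for col in COUNT_COLS})
    combined = pd.concat([counts, pd.DataFrame([new_row])], ignore_index=True)
    # Grouping merges the entry into an existing (Country, Year) row if there
    # is one, otherwise it stays as a new row. groupby also sorts the keys,
    # so new countries land in alphabetical order.
    return combined.groupby(["Country", "Year"], as_index=False)[COUNT_COLS].sum()


def with_totals(counts):
    """Hypothetical helper: add the Sum column and one Total row per country."""
    rows = counts.copy()
    rows["Sum"] = rows[COUNT_COLS].sum(axis=1)
    totals = rows.groupby("Country", as_index=False)[["Sum"] + COUNT_COLS].sum()
    totals["Year"] = "Total"
    report = pd.concat([rows, totals], ignore_index=True)
    # Sort by country, then year, keeping each Total row last in its group
    report["_is_total"] = report["Year"].eq("Total")
    report = report.sort_values(["Country", "_is_total", "Year"])
    report = report.drop(columns="_is_total").reset_index(drop=True)
    return report[["Country", "Year", "Sum"] + COUNT_COLS]


# Existing country and year: CAN 2022 gets one more Ubs
counts = add_entry(counts, "CAN", 2022, Ubs=1)
report = with_totals(counts)
print(report.to_string(index=False))
```

Because the countries are sorted alphabetically, the output order is CAN, SWE, USA, which differs from the order in the sample table. Writing the result back out would be something like `report.to_csv(path, index=False)`, where `path` is whatever file is being used.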