Learn Pandas Via Usecases — Part 2

Learn Python Pandas Via Usecases — Part 2

Use cases open up more functionalities

Learn Pandas Via Usecases — Part 2

Bhavani Ravi

In the last blog, I hope I have sold you the idea that Pandas is an amazing library for quick and easy data analysis and it’s much easier to use than you thought. If you have not read my first blog about Pandas, please go through it before you move forward.

Oops !! We missed Some Data

In the last blog, we saw basic Dataframe operations using sample sales data. Let’s assume you are a manager leading a sales team, and you were all happy about the sales trajectory and the pivot representation of the data you learned to create from ourlast blog.

import numpy as np
df.pivot_table(index=["Country"], 
               columns=["Region"], 
               values=["Quantity"], 
               aggfunc=[np.sum])
Learn Pandas Via Usecases — Part 2
Learn Pandas Via Usecases — Part 2
A simple pivot table

That’s when you realize you have missed sales data of a particular quarter
because it was lost in one of the spreadsheets. Now, what do you do? You already have a report ready to go. How can you incorporate the new data into the current pivot representation without major changes?

Learn Pandas Via Usecases — Part 2
Learn Pandas Via Usecases — Part 2

If you see, the pivot table is constructed with a single Dataframe df, somehow if we can find a way to feed our new data into the df
then we can just re-run the pivot code and voila!! we will get the report again.

So here are the steps we are going to follow,

1. Load the new spreadsheet data into a new Dataframe

df2 = pd.read_csv("data/Pandas - Q4 Sales.csv")
df2.head()

2. Combine two Dataframe into a single df
object,

Using concat

Pandas Concat
method concatenates the contents of multiple Dataframes and creates a new Dataframe.

The axis
param of the method enables you to concatenate data along rows or columns

result_df = pd.concat([df, df2], axis=0, sort=False)
# axis = 0, append along rows, 
# axis = 1, append along cols
result_df.tail() # tail is similar to head returns last 10 entries

Using append

Unlike concat
, the append
method adds up data to an existing dataframe instead of creating a new Dataframe. Also, you can notice that we don’t supply any axis parameter here since append method only allows adding new entries as rows.

result_df = df.append([df2],sort=False)
result_df.tail()

If you take a closer look, in both cases, the data frames that need to be combined are supplied as a python list [df1, df2].
This implies that we can combine as many Dataframes as we want

Learn Pandas Via Usecases — Part 2
Learn Pandas Via Usecases — Part 2

3. Re-run the pivot code

pivot = result_df.pivot_table(index=["Country"], 
                              columns=["Region"], 
                              values="Quantity")
Learn Pandas Via Usecases — Part 2
Learn Pandas Via Usecases — Part 2

Charts are better than tables

You have a couple of hours for your final meeting. Your presentation is concrete, your sales are good but still, something is missing. Charts.
For a management person who was so used to spreadsheets charts, leaving them behind is not a good idea. But, we have a short time to go back to spreadsheets, don’t we? Worry not, Pandas comes with a built-in charting framework which lets you draw graphs of our pivot representation

Learn Pandas Via Usecases — Part 2
Learn Pandas Via Usecases — Part 2

Perfection

As a person who was known for your perfection something doesn’t sit well in you. One of the tabular representations that you have created has unnecessary information that doesn’t interest your management, and a couple of columns have names that are used internally in your company and will not ring any bell to the management.

Worry not, we can do it all in one shot
and pretty quick. In pandas terms, we call this method chaining.

Method chaining enables you to perform a various transformation on the same data without storing the intermediate result.

  1. Explicit is better than implicit hence let’s
    rename “Total” to “Total Sales”
  2. We don’t need the date of purchase just the
    year

    and
    quarter
  3. We don’t need the requester of purchase, Salesperson, and Date of purchase. So let’s
    drop it.
result_df.rename({"Total": "Total Sales"}, axis=1)\
         .assign(Quarter=result_df['Date of Purchase'].dt.quarter, \
                 Year=result_df['Date of Purchase'].dt.year) \
         .drop(["Requester", "Sales Person", "Date of Purchase"], axis=1).head()
Learn Pandas Via Usecases — Part 2
Learn Pandas Via Usecases — Part 2

One Last Thing

With that, our final report looks good and guess what? Your management is not only happy about your sales this year but also excited about your new found love for Pandas, but there is just one last thing remaining, you need to send the final data as a CSV back to your management. But worry not we have pandas to do it for you.

result_df.to_csv(path_or_buf="Export_Data.csv")

An “Export_Data.csv”
file would be created in your current path which you can happily send to your management as an email attachment.

As you rest back on your seat, you want to automate the pandas experiment that you just did for the future sales reports. Thankfully, you have an intern who is joining you in a couple of days. It will be a great project for him to pick it up. Something in me tells that things aren’t going to be as easy as it was for you. which we will see in the next blog
“What’s wrong with Pandas?”

Learn Pandas Via Usecases — Part 2

Did the blog nudge you to deep dive into pandas?

Hold
the “claps” icon and give a shout on

twitter

.

Follow
to stay tuned on future blogs.

Learn Pandas Via Usecases — Part 2
Learn Pandas Via Usecases — Part 2
Preview image with title

Via hackernoon.com

Ref. https://hackernoon.com/@bhavaniravi, https://hackernoon.com/python-pandas-tutorial-92018da85a33, https://docs.google.com/spreadsheets/d/1VJNvRycV4T–Zjq915b0TCeo1C0FZZx9xaUdU2OJ9Fk/edit#gid=1973645975, https://twitter.com/@bhavaniravi, https://union-click.jd.com/jdc?d=ZKPlW2

READ THIS:

发表评论

电子邮件地址不会被公开。 必填项已用*标注