How to Use Python Pandas Pivot Table for Data Presentation and Analysis

Including Cython, Jython, IronPython, PyPy, Django framework, and interpreters: Ipython, IPython Jupyter/Notebook, CPython


Post Reply
User avatar
Eli
Senior Expert Member
Reactions: 183
Posts: 5334
Joined: 9 years ago
Location: Tanzania
Has thanked: 75 times
Been thanked: 88 times
Contact:

#1

Pandas pivot table (pivot_table) is a very powerful and useful tool for data presentation and analysis. If you understand your data and the questions you want to answer, panda's pivot table can help you carry out your data tasks and achieve your goal very quickly.

Let's use this excel data and adopt this article to see panda's pivot table in action. We will mostly be using the module pandas DataFrame.xs.

Now everything is compiled in a single code below, uncomment (remove # before "print(pt)) to check the output in every stage from top - bottom. You do not need to set up your local environment to run the code, run it here.

  1. #Import the required libraries
  2. import pandas as pd
  3. import numpy as np
  4. import urllib.request
  5. #print("This is Pandas version:", pd.__version__)
  6.  
  7. #Download and read the data into DataFrame
  8. urllib.request.urlretrieve("https://pbpython.com/extras/sales-funnel.xlsx", "sales-funnel.xlsx")
  9. df = pd.read_excel("./sales-funnel.xlsx")
  10. #print(df)
  11. #print(df.head())
  12.  
  13. #Let’s define the status column as a category and set the order we want to view
  14. df["Status"] = df["Status"].astype("category")
  15. df["Status"].cat.set_categories(["won","pending","presented","declined"],inplace=True)
  16.  
  17. #Pivoting the data
  18.  
  19. #The simplest pivot table must have a dataframe and an index. Let’s use the Name as our index
  20.  
  21. pt = pd.pivot_table(df,index=["Name"])
  22. #print(pt)
  23.  
  24. #Let's put multiple indices via list
  25. pt = pd.pivot_table(df,index=["Name","Rep","Manager"])
  26. #print(pt)
  27.  
  28. #Let's change index and look by Manager and Rep
  29. #You can see that the pivot table is smart enough, it aggregates the data and summarizes it by grouping the reps with their managers
  30. #This is the power of pandas pito table
  31. pt = pd.pivot_table(df,index=["Manager","Rep"])
  32. #print(pt)
  33.  
  34. #For this purpose, we do not need Account and Quantity columns,
  35. #Let’s omit them by explicitly defining the columns we need using the values field
  36. pt = pd.pivot_table(df,index=["Manager","Rep"],values=["Price"])
  37. #print(pt)
  38.  
  39. #The price column automatically averages the data, we can however count or sum.
  40. #Addition is achieved using aggfunc and np.sum
  41. pt = pd.pivot_table(df,index=["Manager","Rep"],values=["Price"],aggfunc=np.sum)
  42. #print(pt)
  43.  
  44. #aggfunc can take a list of functions such as numpy.mean function and len to get a count
  45. pt = pd.pivot_table(df,index=["Manager","Rep"],values=["Price"],aggfunc=[np.mean,len])
  46. #print(pt)
  47.  
  48. #Columns vs Values
  49. """If we want to break down sales by the products, the columns variable allows us to define one or more columns.
  50. columns optionally provide an additional way to segment the actual values of your interest.
  51. The aggregation functions are applied to the list of values"""
  52. pt = pd.pivot_table(df,index=["Manager","Rep"],values=["Price"],
  53.                columns=["Product"],aggfunc=[np.sum])
  54. #print(pt)
  55.  
  56. #We can remove the NAN's as seen after printing the pivot table above by setting fill_value to 0.
  57.  
  58. pt = pd.pivot_table(df,index=["Manager","Rep"],values=["Price"],
  59.                columns=["Product"],aggfunc=[np.sum],fill_value=0)
  60. #print(pt)
  61.  
  62. #Let's add quantity to the values list
  63. pt = pd.pivot_table(df,index=["Manager","Rep"],values=["Price","Quantity"],
  64.                columns=["Product"],aggfunc=[np.sum],fill_value=0)
  65. #print(pt)
  66.  
  67. #You can move items to the index to get a different visual representation.
  68. #Let's remove Product from the columns and add to the index
  69. pt = pd.pivot_table(df,index=["Manager","Rep","Product"],
  70.                values=["Price","Quantity"],aggfunc=[np.sum],fill_value=0)
  71. #print(pt)
  72.  
  73. #Let's display some totals, to achieve this we use margins=True
  74. pt = pd.pivot_table(df,index=["Manager","Rep","Product"],
  75.                values=["Price","Quantity"],
  76.                aggfunc=[np.sum,np.mean],fill_value=0,margins=True)
  77. #print(pt)
  78.  
  79. #Let’s move the analysis up a level and look at the data pipeline at the manager level
  80. #Notice how the status is ordered based on earlier category definition
  81. pt = pd.pivot_table(df,index=["Manager","Status"],values=["Price"],
  82.                aggfunc=[np.sum],fill_value=0,margins=True)
  83. #print(pt)
  84.  
  85. #It is very handy to pass a dictionary to the aggfunc so that we can perform different functions on each of the values we select
  86. pt = pd.pivot_table(df,index=["Manager","Status"],columns=["Product"],values=["Quantity","Price"],
  87.                aggfunc={"Quantity":len,"Price":np.sum},fill_value=0)
  88. #print(pt)
  89.  
  90. #We can provide a list of aggfuncs to apply to each value too
  91.  
  92. tb = pd.pivot_table(df,index=["Manager","Status"],columns=["Product"],values=["Quantity","Price"],
  93.                aggfunc={"Quantity":len,"Price":[np.sum,np.mean]},fill_value=0)
  94. #print(tb)
  95. tb
  96.  
  97. #Advanced Pivot Table Filtering
  98. #You can filter DataFrame generated data using your standard DataFrame functions
  99. #Suppoes we want to look at just one manager:
  100. pt = tb.query('Manager == ["Debra Henley"]')
  101. #print(tb)
  102.  
  103. #We can look at all of our pending and won deals
  104. pt = tb.query('Status == ["pending","won"]')
  105. print(pt)

0
TSSFL -- A Creative Journey Towards Infinite Possibilities!
Post Reply
  • Similar Topics
    Replies
    Views
    Last post

Return to “Python Programming”

  • Information
  • Who is online

    Users browsing this forum: No registered users and 7 guests