KoBo Collect - KoBoToolbox - Google Sheet - TSSFL Stack - Data Analysis

User avatar
Eli
Senior Expert Member
Reactions: 183
Posts: 5303
Joined: 9 years ago
Location: Tanzania
Has thanked: 75 times
Been thanked: 88 times
Contact:

#1

KoBoToolbox is a free and open-source suite of tools for field data collection, well suited to challenging environments. This post shows how to use KoBoToolbox to collect data and submit it to the KoBo platform using an Android device and web-based data collection forms. KoBoToolbox offers two main options for data collection: the KoBoCollect Android app and Enketo web forms.

Today we focus on collecting data with KoBoCollect on Android, go through the other options that KoBo forms offer for collecting data, and finally import the data from KoBoToolbox into a Google spreadsheet. We also show how to use the koboextractor Python module to read data directly from KoBoToolbox and visualize it with various charts.

The procedure for collecting data with the KoBoCollect Android app is as follows:

1. Go to the KoBoToolbox website and create an account. Confirm/activate the account by clicking the link sent to your email. Clicking the link should log you in automatically; if it does not, log in on the KoBoToolbox website.

2. After logging in, click the New button in the top left corner to create a new form for your project. You will be presented with four options; choose "Build from scratch". Next, fill in the form details as appropriate.

Image

3. Click the plus (+) sign on the left, write a question, and then click "+ADD QUESTION" on the right to choose a response type from the menu. Let's create a simple form that collects a person's name, age, and height.

Image

Image

Image

4. Open a question's settings to choose whether the question is mandatory, or to add skip logic, that is, a condition that decides whether a question is displayed depending on a previous response. Questions can be grouped, and skip logic can be added to a group depending on relevance. You can also drag questions to rearrange them.

5. Apply validation: a condition that a response must satisfy to be accepted. Include an error message to show when the response is not acceptable.

6. Finally, deploy your form. There are multiple options for deploying a KoBo form; we will choose the Android application option and follow these instructions:
  1. Install KoBoCollect on your Android device.
  2. Click on the three vertical dots (...) to open settings.
  3. Enter the server URL https://kc.kobotoolbox.org and your username and password.
  4. Open "Get Blank Form" and select this project.
  5. Open "Enter Data."
Image
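
The questions, skip logic, and validation from steps 3 to 5 can also be written down as an XLSForm "survey" sheet, the spreadsheet format that KoBoToolbox accepts for form upload. The following is only a sketch and not the form used in this post: the question names, the age limits in the constraint, and the skip-logic condition are assumptions chosen for illustration.

```python
import csv
import io

# Column names follow the XLSForm "survey" sheet convention.
COLUMNS = ["type", "name", "label", "required", "relevant",
           "constraint", "constraint_message"]

# Hypothetical rows for a name/age/height form.
SURVEY_ROWS = [
    {"type": "text", "name": "Name", "label": "What is your name?",
     "required": "yes"},
    {"type": "integer", "name": "Age", "label": "How old are you?",
     "required": "yes",
     # Validation: "." stands for the answer being checked.
     "constraint": ". >= 0 and . <= 120",
     "constraint_message": "Age must be between 0 and 120."},
    {"type": "decimal", "name": "Height", "label": "How tall are you?",
     # Skip logic: only show this question once an age has been entered.
     "relevant": "${Age} != ''"},
]


def survey_to_csv(rows):
    """Serialize survey rows to CSV text, leaving unused cells empty."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS, restval="",
                            lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


print(survey_to_csv(SURVEY_ROWS))
```

The sheet is printed as CSV only for readability; an uploadable XLSForm is an .xlsx workbook that holds these columns in a sheet named "survey".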


Android Application: Configure Kobo Collect, Upload Form, and Collect Data

The whole process is shown pictorially as follows: General Settings -> Server -> Type -> KoBoToolbox, then Get Blank Form, and finally Fill Blank Form:

Image (twelve screenshots of the configuration steps above)


Other Options to Collect and Submit Data to KoBoToolbox

A. Online-Offline (multiple submission)

This allows online and offline submissions and is the best option for collecting data in the field:




B. Online-Only (multiple submission)

This is the best option when entering many records at once on a computer, e.g. for transcribing paper records.




C. Online-Only (single submission)

This allows a single submission, and can be paired with the "return_url" parameter to redirect the user to a URL of your choice after the form has been submitted.
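
As a rough illustration of pairing a single-submission link with "return_url", the sketch below appends the parameter to a form link as a query string. The form link and the redirect address are hypothetical placeholders, and passing the parameter in the query string is an assumption; check the link that KoBoToolbox generates for your own form.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_return_url(form_url, return_url):
    """Return form_url with a return_url query parameter appended."""
    parts = urlsplit(form_url)
    # Keep any query parameters the link already carries.
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.append(("return_url", return_url))
    return urlunsplit(parts._replace(query=urlencode(query)))


# Hypothetical single-submission link and redirect target.
link = with_return_url("https://ee.kobotoolbox.org/single/FORM_ID",
                       "https://example.org/thanks")
print(link)
```

Using urlencode takes care of escaping the redirect address, so characters such as ":" and "/" do not break the form link.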




D. Online-Only (once per respondent)

This allows your web form to be submitted only once per user, using basic protection to prevent the same user (on the same browser & device) from submitting more than once.




E. Embeddable version





F. View Only





Importing Data From KoBoToolbox Into Google Sheet

We first need to install API Connector (https://gsuite.google.com/marketplace/a ... 5804724197) so that we can connect to APIs and import their data into Google Sheets. We can then follow these steps:

1. Get your KoBo API key by navigating to https://kf.kobotoolbox.org/token/. The key is the string in double quotes that follows "token".

2. Create your KoBo API request URL. Currently the URL is: https://kf.kobotoolbox.org/api/v2/assets.json


3. Pull KoBo API Data Into Google Sheet:

- Open up Google Sheets and click Add-ons > API Connector > Open.
- In the Create screen, enter the API Request URL we just created in 2 above

Image

4. Under Headers, enter two key-value pairs like this:

- Authorization: Token YOUR_API_TOKEN
- Accept: application/json

Replace only YOUR_API_TOKEN, using the token you got in step 1 above.

Image


5. We don’t need OAuth2 authentication so just leave that set to None. Create a new tab and click ‘Set current’ to use that tab as your data destination.

6. Name your request and click Run. A moment later you’ll see a list of your Kobo assets (forms/projects, questions, blocks, templates, collections) populate your sheet.

Image

We named our request Import From KOBO and this is what we get:




See reference for importing data into Google sheet. See also discussion on KoBo Community.
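
For comparison, the request that API Connector sends in the steps above can be sketched in plain Python with the standard library. This is a minimal sketch, not part of the API Connector workflow: the function names are hypothetical, and the "uid" and "name" fields read in the usage comment are assumptions about the response layout.

```python
import json
import urllib.request

ASSETS_URL = "https://kf.kobotoolbox.org/api/v2/assets.json"


def build_assets_request(api_token):
    """Build the GET request with the two headers from step 4."""
    return urllib.request.Request(
        ASSETS_URL,
        headers={
            "Authorization": f"Token {api_token}",
            "Accept": "application/json",
        },
    )


def fetch_assets(api_token):
    """Send the request and decode the JSON reply (needs network access)."""
    with urllib.request.urlopen(build_assets_request(api_token)) as response:
        return json.load(response)


# Usage, after replacing YOUR_API_TOKEN with the token from step 1:
# assets = fetch_assets("YOUR_API_TOKEN")
# for asset in assets["results"]:
#     print(asset.get("uid"), asset.get("name"))
```

Keeping the request construction separate from the network call makes it easy to inspect the URL and headers before anything is sent.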

We can also download our data from KoBoToolbox, for example in CSV format, upload it to Dropbox, and then retrieve the data for analysis with TSSFL Stack.
TSSFL -- A Creative Journey Towards Infinite Possibilities!

#2

See what the deployed forms look like:

Image

Image

#3

We can read KoBo data saved in Dropbox as below:

    import urllib.request
    import pandas as pd

    #Download the CSV file exported from KoBoToolbox (dl=1 requests a direct download)
    urllib.request.urlretrieve('https://www.dropbox.com/s/ydvnsh33sf9rq5e/KoBo.csv?dl=1', 'KoBo.csv')

    df = pd.read_csv("KoBo.csv")
    #Renaming columns: https://stackoverflow.com/questions/11346283/renaming-column-names-in-pandas
    #df2 = df.rename({'What is your name?': 'Name', 'How old are you?': 'Age', 'How tall are you?': 'Height'}, axis=1)  # new method
    #regex=False treats the "?" literally instead of as a regular-expression operator
    #df.columns = df.columns.str.replace('What is your name?', 'Name', regex=False)
    print(df)
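
The column renaming that is commented out in the listing above can be tried without downloading anything. The sketch below uses a small in-memory CSV; the two data rows are hypothetical sample values, and a real KoBo export carries extra metadata columns.

```python
import io

import pandas as pd

# Hypothetical two-row export using the question labels as column headers.
SAMPLE_CSV = """What is your name?,How old are you?,How tall are you?
Asha,31,165
Juma,45,178
"""

df = pd.read_csv(io.StringIO(SAMPLE_CSV))

# Map the long question labels to short column names.
renamed = df.rename(columns={
    "What is your name?": "Name",
    "How old are you?": "Age",
    "How tall are you?": "Height",
})

# Alternative for a single column; regex=False keeps "?" literal.
single = df.columns.str.replace("What is your name?", "Name", regex=False)

print(renamed)
print(list(single))
```

Short names such as Name, Age, and Height are easier to use as plot labels and as keys when selecting columns.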


#4

Let's visualize the data imported from KoBoToolbox into the Google Sheet:

    #Plot some graphs
    #Import required libraries
    import gspread
    import urllib.request
    import numpy as np
    import matplotlib.pyplot as plt
    import seaborn as sns
    import pandas as pd

    #Disabled alternative: read the spreadsheet through the Google Sheets API with gspread
    """
    urllib.request.urlretrieve("https://www.dropbox.com/s/mqsyfuetv8potvd/credentials.json?dl=1", "credentials.json")

    gc = gspread.service_account(filename="credentials.json")
    sh = gc.open_by_key("1BC48PKPZW71AC6hOn1SNesvZk1PoBjE4wXwl1YlWFpY") #Open spreadsheet,
    #the spreadsheet ID is the string between "" in the line above
    """
    #Alternative
    #If your file only has one sheet, replace sheet_url
    sheet_url = "https://docs.google.com/spreadsheets/d/1GQk7WzixWPs7iyEwAEm63cMcW6dDX_MzEq0eUKucgEA/edit#gid=0"
    url_1 = sheet_url.replace('/edit#gid=', '/export?format=csv&gid=')

    data = pd.read_csv(url_1)

    #worksheet = sh.sheet1

    #Define variables (column names in the sheet)
    var1 = "results.Name"
    var2 = "results.Age"
    var3 = "results.Height"

    #age = worksheet.col_values(3)[1:]
    age = data[var2]
    print("Ages:", age)
    #height = worksheet.col_values(4)[1:]
    height = data[var3]
    print("Heights:", height)

    #Pandas is very useful for Google spreadsheets
    #Convert the JSON to a Pandas dataframe
    #Get all data records as a dictionary
    #data = worksheet.get_all_records()
    #df = pd.DataFrame.from_dict(data)

    #Let's get some statistics
    #age_arr = np.array(age)
    #age_array = age_arr.astype(float)
    #h_arr = np.array(height)
    #h_array = h_arr.astype(float)

    print("Average Age:", np.mean(age))
    print("Mean Height:", np.mean(height))
    print("Minimum and Maximum Age:", np.min(age), np.max(age))
    print("Minimum and Maximum Height:", np.min(height), np.max(height))

    #Let's visualize
    #Graph styles and font size
    sns.set_style('darkgrid') # darkgrid, whitegrid, dark, white and ticks
    plt.rc('axes', titlesize=18)     # fontsize of the axes title
    plt.rc('axes', labelsize=14)    # fontsize of the x and y labels
    plt.rc('xtick', labelsize=13)    # fontsize of the tick labels
    plt.rc('ytick', labelsize=13)    # fontsize of the tick labels
    plt.rc('legend', fontsize=13)    # legend fontsize
    plt.rc('font', size=13)          # controls default text sizes

    #sns list of color palettes
    #print(sns.color_palette('deep'), sns.color_palette("pastel"), sns.color_palette("Set2"))

    #Let's read data from Google Sheets into Pandas without the Google Sheets API
    #Useful for multiple sheets
    sheet_id = "1GQk7WzixWPs7iyEwAEm63cMcW6dDX_MzEq0eUKucgEA"
    sheet_name = "Sheet1"
    url = f"https://docs.google.com/spreadsheets/d/{sheet_id}/gviz/tq?tqx=out:csv&sheet={sheet_name}"

    #If your file only has one sheet, replace sheet_url
    #sheet_url = "https://docs.google.com/spreadsheets/d/1XqOtPkiE_Q0dfGSoyxrH730RkwrTczcRbDeJJpqRByQ/edit#gid=0"
    #url_1 = sheet_url.replace('/edit#gid=', '/export?format=csv&gid=')

    #Get Pandas dataframe
    dataset = pd.read_csv(url)
    #print(dataset)

    #Names = worksheet.col_values(2)[1:]
    #Names = data[var1]
    #print(Names)

    df_names = dataset[var1]
    df_ages = dataset[var2]
    df_heights = dataset[var3]
    print(df_names)

    #Preprocessing
    #numeric_only=True skips text columns, which pandas >= 2.0 refuses to average
    plots = dataset.groupby([var1], as_index=False).mean(numeric_only=True)
    #print(plots)

    #Bar Plot in Matplotlib with plt.bar()
    #Names vs Age
    plt.figure(figsize=(10,5), tight_layout=True)
    colors = sns.color_palette('pastel')
    plt.bar(dataset[var1], dataset[var2], color=colors[:5])
    plt.xlabel(var1)
    plt.xticks(rotation=90)
    plt.ylabel('Age')
    plt.title('Barplot')
    plt.show()

    #Name vs Height
    plt.figure(figsize=(10,5), tight_layout=True)
    colors = sns.color_palette('deep')
    plt.bar(dataset[var1], dataset[var3], color=colors[:6])
    plt.xlabel(var1)
    plt.xticks(rotation=90)
    plt.ylabel('Height')
    plt.title('Barplot')
    plt.show()

    #Bar Plot in Seaborn with sns.barplot()
    #errorbar=None needs seaborn >= 0.12; use ci=None on older versions
    plt.figure(figsize=(10,5), tight_layout=True)
    ax = sns.barplot(x=dataset[var1], y=dataset[var2], palette='pastel', errorbar=None)
    ax.set(title='Barplot with Seaborn', xlabel='Names', ylabel='Age')
    plt.xticks(rotation=90)
    plt.show()

    #Barplot grouped data by "n" variables
    plt.figure(figsize=(12, 6), tight_layout=True)
    ax = sns.barplot(x=dataset[var2], y=dataset[var3], hue=dataset[var1], palette='pastel')
    ax.set(title='Age vs Height', xlabel='Age', ylabel='Height')
    ax.legend(title='Names', title_fontsize=13, loc='upper right')
    plt.show()

    #Histograms with plt.hist() or sns.histplot()
    #Both versions are drawn on the same axes here
    plt.figure(figsize=(10,6), tight_layout=True)
    bins = [160, 165, 170, 175, 180, 185, 190, 195, 200]
    # matplotlib
    plt.hist(dataset[var3], bins=bins, color=sns.color_palette('Set2')[2], linewidth=2)
    plt.title('Histogram')
    plt.xlabel('Height (cm)')
    plt.ylabel('Count')
    # seaborn
    ax = sns.histplot(data=dataset, x=var3, bins=bins, color=sns.color_palette('Set2')[2], linewidth=2)
    ax.set(title='Histogram', xlabel='Height (cm)', ylabel='Count')
    plt.show()

    #Boxplot
    plt.figure(figsize=(10,6), tight_layout=True)
    ax = sns.boxplot(data=dataset, x=var1, y=var2, palette='Set2', linewidth=2.5)
    ax.set(title='Boxplot', xlabel='Names', ylabel='Age (Years)')
    plt.xticks(rotation=90)
    plt.show()

    #Scatter plot
    plt.figure(figsize=(10,6), tight_layout=True)
    ax = sns.scatterplot(data=dataset, x=var2, y=var3, hue=var1, palette='Set2', s=60)
    ax.set(xlabel='Age (Years)', ylabel='Height (cm)')
    ax.legend(title='People', title_fontsize=12)
    plt.show()

    #Mean age and height per name
    pivot = dataset.groupby([var1], as_index=False).mean(numeric_only=True)
    relationship = pivot.loc[:,var2:var3]
    print(relationship)

    #Plot the same table with each Pandas chart type in turn
    charts = ["bar", "line", "barh", "hist", "box", "kde", "density", "area"]
    for chart_type in charts:
        relationship.plot(kind=chart_type)
        plt.title("%s plot" % chart_type)
        plt.show()

    #Seaborn
    plt.figure()
    sns.set_style("darkgrid")
    sns.lineplot(data = dataset, x = var2, y = var3)
    plt.show()

    plt.figure()
    sns.set_style("whitegrid")
    sns.lineplot(data = dataset, x = var2, y = var3)
    plt.show()

    #Hexbin
    #Split the plotting window into 20 hexbins
    plt.figure()
    nbins = 20
    plt.title('Hexbin')
    plt.hexbin(dataset[var2], dataset[var3], gridsize=nbins, color=colors[:6])
    plt.show()

    #2-D Hist
    plt.figure()
    plt.title('2-D Histogram')
    plt.hist2d(dataset[var2], dataset[var3], bins=nbins, color=colors[:5])
    plt.show()

    #Set variables
    x = dataset[var2]
    y = dataset[var3]
    z = dataset[var1]

    #Linear Regression
    plt.figure()
    sns.regplot(x = x, y = y, data=dataset)
    plt.show()

    #jointplot creates its own figure, so no plt.figure() call is needed
    sns.jointplot(x=x, y=y, data=dataset, kind="reg")
    plt.show()

    #Set seaborn style
    sns.set_style("white")

    # Basic 2D density plot
    plt.figure()
    sns.kdeplot(x=x, y=y)
    plt.show()

    # Customize the color, add fill and bandwidth (fill replaces the deprecated shade)
    plt.figure()
    sns.kdeplot(x=x, y=y, cmap="Reds", fill=True, bw_adjust=.5)
    plt.show()

    # Add thresh parameter
    plt.figure()
    sns.kdeplot(x=x, y=y, cmap="Blues", fill=True, thresh=0)
    plt.show()

    #Joint plot
    sns.jointplot(x = x, y = y, data = dataset, kind = 'hex')
    plt.show()
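
The listing above builds two kinds of CSV links for a Google Sheet: the export link obtained by rewriting the "/edit#gid=" part of the address, and the gviz link built from a sheet ID and sheet name. A minimal sketch of both as reusable helpers follows; the function names are hypothetical, and the links are assumed to return CSV only when the sheet is shared so that anyone with the link can view it.

```python
import re
from urllib.parse import quote

BASE = "https://docs.google.com/spreadsheets/d"


def export_csv_url(sheet_url):
    """Turn a '.../edit#gid=N' address into a '.../export?format=csv&gid=N' link."""
    match = re.search(r"/spreadsheets/d/([^/]+)/edit#gid=(\d+)", sheet_url)
    if match is None:
        raise ValueError("not a recognised Google Sheets edit URL")
    sheet_id, gid = match.groups()
    return f"{BASE}/{sheet_id}/export?format=csv&gid={gid}"


def gviz_csv_url(sheet_id, sheet_name):
    """Build the gviz CSV link for one named sheet of a spreadsheet."""
    # quote() escapes spaces and other special characters in the sheet name.
    return f"{BASE}/{sheet_id}/gviz/tq?tqx=out:csv&sheet={quote(sheet_name)}"


sheet = "https://docs.google.com/spreadsheets/d/1GQk7WzixWPs7iyEwAEm63cMcW6dDX_MzEq0eUKucgEA/edit#gid=0"
print(export_csv_url(sheet))
print(gviz_csv_url("1GQk7WzixWPs7iyEwAEm63cMcW6dDX_MzEq0eUKucgEA", "imported_from_kobo"))
```

Either link can be passed straight to pd.read_csv; the gviz form is the convenient one when a spreadsheet has several sheets.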


#5

To get the data of a specific asset in your KoBoToolbox account, create a KoBo API request in API Connector using a URL of the form https://kf.kobotoolbox.org/api/v2/assets/xxx/data.json, where xxx is the unique ID (uid) of the asset:
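
As a small illustration of that URL pattern, the sketch below builds the data link for an asset and pulls the submissions out of a response body. The asset ID and the sample response are hypothetical; the "results" key is the one used by the Python code in this thread.

```python
import json

API_ROOT = "https://kf.kobotoolbox.org/api/v2"


def asset_data_url(asset_uid):
    """Return the data.json link for one asset."""
    return f"{API_ROOT}/assets/{asset_uid}/data.json"


def extract_results(payload_text):
    """Return the list of submissions held under the "results" key."""
    payload = json.loads(payload_text)
    return payload.get("results", [])


# Hypothetical asset ID and response body, for illustration only.
print(asset_data_url("aBcD1234"))
sample = '{"count": 1, "results": [{"Name": "Asha", "Age": "31"}]}'
print(extract_results(sample))
```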


#6

We use the koboextractor Python module to extract the list of assets and the data from KoBoToolbox, and convert the KoBo data into a Pandas DataFrame. We can then process, analyze, and visualize the data. The only change required is to replace "KoBo_Token" with a valid KoBoToolbox API key.

    import pandas as pd
    from koboextractor import KoboExtractor #https://pypi.org/project/koboextractor/

    kobo = KoboExtractor("KoBo_Token", 'https://kf.kobotoolbox.org/api/v2', debug=True)

    #Get the unique ID of the ith asset in your KoBoToolbox account
    #(the list is zero-indexed, so i = 1 selects the second asset):
    i = 1
    assets = kobo.list_assets()
    asset_uid = assets['results'][i]['uid']

    #Get the details of that asset
    asset = kobo.get_asset(asset_uid)
    #choice_lists = kobo.get_choices(asset)
    #questions = kobo.get_questions(asset=asset, unpack_multiples=True)

    #Download all responses submitted after a certain point in time:
    #new_data = kobo.get_data(asset_uid, submitted_after='2020-05-20T17:29:30')
    #Get all data
    new_data = kobo.get_data(asset_uid)

    #new_data['results'] is an unordered list of form submissions
    #We can sort this list by submission time by calling:
    new_results = kobo.sort_results_by_time(new_data['results'])
    #print(new_results)

    #Create DataFrame
    df = pd.DataFrame(new_results)
    print("DataFrame:", df.head())
    print("KoBo Data Column Names:", df.columns)
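
To make the sorting step concrete, here is a plain-Python sketch of ordering submissions by time. It is not koboextractor's implementation; it assumes each submission carries a "_submission_time" field in the same timestamp format as the submitted_after example above, and the three records are hypothetical.

```python
from datetime import datetime


def sort_by_submission_time(results, reverse=False):
    """Return submissions ordered by their _submission_time timestamp."""
    return sorted(
        results,
        key=lambda record: datetime.fromisoformat(record["_submission_time"]),
        reverse=reverse,
    )


# Hypothetical submissions, deliberately out of order.
submissions = [
    {"Name": "Juma", "_submission_time": "2020-05-21T09:15:00"},
    {"Name": "Asha", "_submission_time": "2020-05-20T17:29:30"},
    {"Name": "Neema", "_submission_time": "2020-05-20T18:02:10"},
]

for record in sort_by_submission_time(submissions):
    print(record["_submission_time"], record["Name"])
```

Parsing the timestamps before comparing them avoids relying on string ordering, which only works while every timestamp uses exactly the same format.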


#7

The code below does the following: it reads data directly from KoBoToolbox, converts the data into a Pandas DataFrame, selects a subset of the DataFrame and uploads it to the "imported_from_kobo" sheet of the Kobo_Android_Data Google Spreadsheet, and then plots charts for the selected dataset:

    #Import required libraries
    import urllib.request
    import gspread
    import numpy as np
    import pandas as pd
    import matplotlib.pyplot as plt
    import seaborn as sns
    from koboextractor import KoboExtractor #https://pypi.org/project/koboextractor/

    kobo = KoboExtractor("KoBo_Token", 'https://kf.kobotoolbox.org/api/v2', debug=True)

    #Get the unique ID of the ith asset in your KoBoToolbox account
    #(the list is zero-indexed, so i = 1 selects the second asset):
    i = 1
    assets = kobo.list_assets()
    asset_uid = assets['results'][i]['uid']

    #Get the details of that asset
    asset = kobo.get_asset(asset_uid)
    #choice_lists = kobo.get_choices(asset)
    #questions = kobo.get_questions(asset=asset, unpack_multiples=True)

    #Download all responses submitted after a certain point in time:
    #new_data = kobo.get_data(asset_uid, submitted_after='2020-05-20T17:29:30')
    #Get all data
    new_data = kobo.get_data(asset_uid)

    #new_data['results'] is an unordered list of form submissions
    #We can sort this list by submission time by calling:
    new_results = kobo.sort_results_by_time(new_data['results'])
    #print(new_results)

    #Create DataFrame
    df = pd.DataFrame(new_results)
    print("DataFrame:", df.head())
    print("KoBo Data Column Names:", df.columns)

    #Slice a dataframe
    dataset = df[["Name", "Age", "Height"]]
    #print("df:", df["Age"]=="40")

    #Convert just the "Age" and "Height" columns to numeric data types
    df[["Age", "Height"]] = df[["Age", "Height"]].apply(pd.to_numeric)
    print("df:", df["Age"]==40)
    print("df:", df)

    #df = df.iloc[:, 2:9]

    #Select multiple ranges of columns in Pandas DataFrame
    #(the column positions depend on your own form)
    df = df.iloc[:, np.r_[2:9, 12:13, 14, 16]]

    #Download the service account credentials and open the spreadsheet
    urllib.request.urlretrieve("https://www.dropbox.com/s/m728v370159b2xm/credentials.json?dl=1", "credentials.json")

    gc = gspread.service_account(filename="credentials.json")

    wb = gc.open_by_key("1GQk7WzixWPs7iyEwAEm63cMcW6dDX_MzEq0eUKucgEA")

    new_sheet = wb.worksheet("imported_from_kobo")

    #Upload the header row followed by the data rows
    new_sheet.update([df.columns.values.tolist()] + df.values.tolist())

    var1 = "Name"
    var2 = "Age"
    var3 = "Height"

    #Plot some graphs
    age = df[var2]
    print("Ages:", age)
    height = df[var3]
    print("Heights:", height)

    print("Average Age:", np.mean(age))
    print("Mean Height:", np.mean(height))
    print("Minimum and Maximum Age:", np.min(age), np.max(age))
    print("Minimum and Maximum Height:", np.min(height), np.max(height))

    #Let's visualize
    #Graph styles and font size
    sns.set_style('darkgrid') # darkgrid, whitegrid, dark, white and ticks
    plt.rc('axes', titlesize=18)     # fontsize of the axes title
    plt.rc('axes', labelsize=14)    # fontsize of the x and y labels
    plt.rc('xtick', labelsize=13)    # fontsize of the tick labels
    plt.rc('ytick', labelsize=13)    # fontsize of the tick labels
    plt.rc('legend', fontsize=13)    # legend fontsize
    plt.rc('font', size=13)          # controls default text sizes

    #sns list of color palettes
    #print(sns.color_palette('deep'), sns.color_palette("pastel"), sns.color_palette("Set2"))

    df_names = df[var1]
    df_ages = df[var2]
    df_heights = df[var3]
    print(df_names)

    #Preprocessing
    #numeric_only=True skips text columns, which pandas >= 2.0 refuses to average
    plots = df.groupby([var1], as_index=False).mean(numeric_only=True)
    #print(plots)

    #Bar Plot in Matplotlib with plt.bar()
    #Names vs Age
    plt.figure(figsize=(10,5), tight_layout=True)
    colors = sns.color_palette('pastel')
    plt.bar(df[var1], df[var2], color=colors[:5])
    plt.xlabel(var1)
    plt.xticks(rotation=90)
    plt.ylabel('Age')
    plt.title('Barplot')
    plt.show()

    #Name vs Height
    plt.figure(figsize=(10,5), tight_layout=True)
    colors = sns.color_palette('deep')
    plt.bar(df[var1], df[var3], color=colors[:6])
    plt.xlabel(var1)
    plt.xticks(rotation=90)
    plt.ylabel('Height')
    plt.title('Barplot')
    plt.show()

    print(df[var1], df[var2])

    #Bar Plot in Seaborn with sns.barplot()
    #errorbar=None needs seaborn >= 0.12; use ci=None on older versions
    plt.figure(figsize=(10,5), tight_layout=True)
    ax = sns.barplot(x=df[var1].astype("category"), y=df[var2].astype(np.float32), palette='pastel', errorbar=None)
    ax.set(title='Barplot with Seaborn', xlabel='Names', ylabel='Age')
    plt.xticks(rotation=90)
    plt.show()

    #Barplot grouped data by "n" variables
    plt.figure(figsize=(12, 6), tight_layout=True)
    ax = sns.barplot(x=df[var2].astype(np.float32), y=df[var3].astype(np.float32), hue=df[var1].astype("category"), palette='pastel')
    ax.set(title='Age vs Height', xlabel='Age', ylabel='Height')
    ax.legend(title='Names', title_fontsize=13, loc='upper right')
    plt.show()
    plt.clf()

    #Histograms with plt.hist() or sns.histplot()
    #Both versions are drawn on the same axes here
    plt.figure(figsize=(10,6), tight_layout=True)
    bins = [160, 165, 170, 175, 180, 185, 190, 195, 200]

    # matplotlib
    plt.hist(df[var3], bins=bins, color=sns.color_palette('Set2')[2], linewidth=2)
    plt.title('Histogram')
    plt.xlabel('Height (cm)')
    plt.ylabel('Count')
    # seaborn
    ax = sns.histplot(data=df, x=var3, bins=bins, color=sns.color_palette('Set2')[2], linewidth=2)
    ax.set(title='Histogram', xlabel='Height (cm)', ylabel='Count')
    plt.show()
    plt.clf()

    #Boxplot
    plt.figure(figsize=(10,6), tight_layout=True)
    ax = sns.boxplot(data=df, x=var1, y=var2, palette='Set2', linewidth=2.5)
    ax.set(title='Boxplot', xlabel='Names', ylabel='Age (Years)')
    plt.xticks(rotation=90)
    plt.show()

    #Scatter plot
    plt.figure(figsize=(10,6), tight_layout=True)
    ax = sns.scatterplot(data=df, x=var2, y=var3, hue=var1, palette='Set2', s=60)
    ax.set(xlabel='Age (Years)', ylabel='Height (cm)')
    ax.legend(title='People', title_fontsize=12)
    plt.show()

    #Mean age and height per name
    pivot = df.groupby([var1], as_index=False).mean(numeric_only=True)
    relationship = pivot.loc[:,var2:var3]
    print(relationship)

    #Plot the same table with each Pandas chart type in turn
    charts = ["bar", "line", "barh", "hist", "box", "kde", "density", "area"]
    for chart_type in charts:
        relationship.plot(kind=chart_type)
        plt.title("%s plot" % chart_type)
        plt.show()

    #Seaborn
    plt.figure()
    sns.set_style("darkgrid")
    sns.lineplot(data = df, x = var2, y = var3)
    plt.show()

    plt.figure()
    sns.set_style("whitegrid")
    sns.lineplot(data = df, x = var2, y = var3)
    plt.show()

    #Hexbin
    #Split the plotting window into 20 hexbins
    plt.figure()
    nbins = 20
    plt.title('Hexbin')
    plt.hexbin(df[var2], df[var3], gridsize=nbins, color=colors[:6])
    plt.show()

    #2-D Hist
    plt.figure()
    plt.title('2-D Histogram')
    plt.hist2d(df[var2], df[var3], bins=nbins, color=colors[:5])
    plt.show()

    #Set variables
    x = df[var2]
    y = df[var3]
    z = df[var1]

    #Linear Regression
    plt.figure()
    sns.regplot(x = x, y = y, data=df)
    plt.show()

    #jointplot creates its own figure, so no plt.figure() call is needed
    sns.jointplot(x=x, y=y, data=df, kind="reg")
    plt.show()

    #Set seaborn style
    sns.set_style("white")

    # Basic 2D density plot
    plt.figure()
    sns.kdeplot(x=x, y=y)
    plt.show()

    # Customize the color, add fill and bandwidth (fill replaces the deprecated shade)
    plt.figure()
    sns.kdeplot(x=x, y=y, cmap="Reds", fill=True, bw_adjust=.5)
    plt.show()

    # Add thresh parameter
    plt.figure()
    sns.kdeplot(x=x, y=y, cmap="Blues", fill=True, thresh=0)
    plt.show()

    #Joint plot
    sns.jointplot(x = x, y = y, data = df, kind = 'hex')
    plt.show()
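
The upload step in the listing above sends the sheet a list of rows: the column names first, then one list per submission. The sketch below shows that layout, together with the string-to-number conversion, in plain Python without any network access. The helper names and the sample record are hypothetical.

```python
def to_number(value):
    """Convert a numeric string to int or float; leave other values unchanged."""
    try:
        number = float(value)
    except (TypeError, ValueError):
        return value
    return int(number) if number.is_integer() else number


def records_to_rows(records, columns, numeric=()):
    """Return a header row followed by one row per record."""
    rows = [list(columns)]
    for record in records:
        row = []
        for column in columns:
            value = record.get(column, "")
            # Only the listed columns are converted to numbers.
            row.append(to_number(value) if column in numeric else value)
        rows.append(row)
    return rows


# Hypothetical submission; answers arrive as strings.
records = [{"Name": "Asha", "Age": "31", "Height": "165.5", "_id": 1}]
rows = records_to_rows(records, ["Name", "Age", "Height"],
                       numeric=("Age", "Height"))
print(rows)
```

A list shaped like rows is what the listing passes to new_sheet.update, built there from df.columns and df.values.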


#9

Mobile data collection using KoBo Collect and KoBoToolbox platform
