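【ATU Book - i.MX8 Series - Colab】 Colab Tips and Techniques (Part 2)

I. Overview

This article introduces Colab, a very handy free cloud service provided by Google.

Continuing the approach of the previous chapter, this series is meant to serve as a reference handbook. This part continues with practical plotting techniques: besides the commonly used Python library matplotlib, it brings in Altair, Plotly, and Bokeh for a wider variety of charts, and uses Altair to build interactive charts. As the article structure diagram below shows, this post belongs to the i.MX8M Plus solution posts, under the Third Party software resources topic "Google Colab 密技大公开" (Google Colab tips revealed). The current chapter is "Colab Tips and Techniques (Part 2)".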

 

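New readers who want to learn more about artificial intelligence, machine learning, and deep learning can follow the posts below:
 大大通 featured posts   【ATU Book - i.MX8 Series】 post index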

 

 

Figure: article structure diagram of the Colab blog series

 

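II. Google Colab Tips and Techniques

1. Plotting charts:

Colab provides the commonly used Matplotlib package for displaying chart data.

Each example below gives the code for a code cell (shown on a grey background in the original post), followed by the figure it produces. Copy and paste the code into Colab to run it.

(1) Line chart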

import matplotlib.pyplot as plt
x = [1, 2, 3, 4, 5, 6, 7, 8, 9]
y1 = [1, 3, 5, 3, 1, 3, 5, 3, 1]
y2 = [2, 4, 6, 4, 2, 4, 6, 4, 2]
# Draw two lines on the same axes; the labels feed the legend.
plt.plot(x, y1, label="line L")
plt.plot(x, y2, label="line H")
plt.xlabel("x axis")
plt.ylabel("y axis")
plt.title("Line Graph Example")
plt.legend()
plt.show()

(2) Bar chart

import matplotlib.pyplot as plt
# x = 4 and x = 6 appear in both series, so the bars drawn there overlap.
x1 = [1, 3, 4, 5, 6, 7, 9]
y1 = [4, 7, 2, 4, 7, 8, 3]
x2 = [2, 4, 6, 8, 10]
y2 = [5, 6, 2, 6, 2]

# Colors: https://matplotlib.org/api/colors_api.html
plt.bar(x1, y1, label="Blue Bar", color='b')
plt.bar(x2, y2, label="Green Bar", color='g')
plt.xlabel("bar number")
plt.ylabel("bar height")
plt.title("Bar Chart Example")
plt.legend()
plt.show()
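
In the bar chart above, the green bars are drawn on top of the blue ones wherever both series share an x value. One common way to avoid that is a grouped bar chart. The following is a minimal sketch, not part of the original post: the data values and the bar width of 0.4 are assumptions chosen for illustration.

```python
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical data: two series measured at the same five x positions.
positions = np.arange(1, 6)
blue_heights = [4, 7, 2, 4, 7]
green_heights = [5, 6, 2, 6, 2]

width = 0.4  # assumed bar width; two bars then fill 0.8 of each unit on the x axis

fig, ax = plt.subplots()
# Shift each series by half a bar width so the bars sit side by side.
blue = ax.bar(positions - width / 2, blue_heights, width=width, label="Blue Bar", color="b")
green = ax.bar(positions + width / 2, green_heights, width=width, label="Green Bar", color="g")
ax.set_xticks(positions)
ax.set_xlabel("bar number")
ax.set_ylabel("bar height")
ax.set_title("Grouped Bar Chart Example")
ax.legend()
plt.show()
```

With this layout each blue bar ends exactly where its green neighbour begins, so neither series hides the other.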

 

(3) Histogram

import matplotlib.pyplot as plt
import numpy as np
# Use numpy to generate a bunch of random data in a bell curve around 5.
n = 5 + np.random.randn(1000)
# Histogram
plt.hist(n, bins=20)
plt.title("Histogram")
plt.show()
plt.hist(n, cumulative=True, bins=20)
plt.title("Cumulative Histogram")
plt.show()
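
To see what the two histograms above actually plot, the bin counts can be computed directly with NumPy. This is a small sketch under assumptions: the fixed seed is only there to make the run reproducible, and `np.histogram` is used here as a stand-in for the counting that `plt.hist` performs before drawing.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, chosen only for reproducibility
samples = 5 + rng.standard_normal(1000)

# Same binning as plt.hist(samples, bins=20): 20 equal-width bins over the data range.
counts, edges = np.histogram(samples, bins=20)

# The cumulative histogram is the running total of the per-bin counts.
cumulative = np.cumsum(counts)

print("per-bin counts:   ", counts)
print("cumulative counts:", cumulative)
```

The last cumulative value always equals the number of samples, which is why the cumulative histogram climbs to 1000 on the right-hand side.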

 

(4) Scatter plot

import matplotlib.pyplot as plt
x1 = [2, 3, 4]
y1 = [5, 5, 5]
x2 = [1, 2, 3, 4, 5]
y2 = [2, 3, 2, 3, 4]
y3 = [6, 8, 7, 8, 7]
# Markers: https://matplotlib.org/api/markers_api.html
plt.scatter(x1, y1)
plt.scatter(x2, y2, marker='v', color='r')
plt.scatter(x2, y3, marker='^', color='m')
plt.title('Scatter Plot Example')
plt.show()

 

(5) Stack plot (stacked area chart)

import matplotlib.pyplot as plt
idxes = [ 1, 2, 3, 4, 5, 6, 7, 8, 9]
arr1 = [23, 40, 28, 43, 8, 44, 43, 18, 17]
arr2 = [17, 30, 22, 14, 17, 17, 29, 22, 30]
arr3 = [15, 31, 18, 22, 18, 19, 13, 32, 39]

# Adding legend for stack plots is tricky.
plt.plot([], [], color='r', label = 'D 1')
plt.plot([], [], color='g', label = 'D 2')
plt.plot([], [], color='b', label = 'D 3')
plt.stackplot(idxes, arr1, arr2, arr3, colors= ['r', 'g', 'b'])
plt.title('Stack Plot Example')
plt.legend()
plt.show()
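
The example above builds its legend from three empty `plt.plot([], [])` calls. As an alternative sketch, `stackplot` also accepts a `labels` argument, which gives the same legend without the placeholder lines. The data is the same as above; the legend location is an assumption made so the legend stays clear of the tallest areas.

```python
import matplotlib.pyplot as plt

idxes = [1, 2, 3, 4, 5, 6, 7, 8, 9]
arr1 = [23, 40, 28, 43, 8, 44, 43, 18, 17]
arr2 = [17, 30, 22, 14, 17, 17, 29, 22, 30]
arr3 = [15, 31, 18, 22, 18, 19, 13, 32, 39]

fig, ax = plt.subplots()
# Pass the legend labels straight to stackplot, one per stacked series.
ax.stackplot(idxes, arr1, arr2, arr3,
             labels=["D 1", "D 2", "D 3"],
             colors=["r", "g", "b"])
ax.set_title("Stack Plot Example")
ax.legend(loc="upper left")
plt.show()
```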

 

(6) Pie chart

import matplotlib.pyplot as plt
labels = 'S1', 'S2', 'S3'
sections = [56, 66, 24]
colors = ['c', 'g', 'y']
plt.pie(sections, labels=labels, colors=colors, startangle=90, explode = (0, 0.1, 0), autopct = '%1.2f%%')
plt.axis('equal') # Try commenting this out.
plt.title('Pie Chart Example')
plt.show()
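
The `autopct = '%1.2f%%'` argument above tells the pie chart to print each slice's share of the total with two decimal places. The short sketch below reproduces that arithmetic by hand for the same `sections` values, so the numbers on the chart can be checked. It is an illustration of the formatting, not matplotlib's internal code.

```python
sections = [56, 66, 24]
total = sum(sections)  # 146

# Apply the same format string that was passed as autopct.
percentages = ['%1.2f%%' % (100 * value / total) for value in sections]
print(percentages)  # ['38.36%', '45.21%', '16.44%']
```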

 

(7) Subplots

import matplotlib.pyplot as plt
import numpy as np

def random_plots():
    """Return x values 0..19 paired with random integer y values in [0, 10)."""
    xs = []
    ys = []
    for i in range(20):
        x = i
        y = np.random.randint(10)
        xs.append(x)
        ys.append(y)
    return xs, ys

fig = plt.figure()
ax1 = plt.subplot2grid((5, 2), (0, 0), rowspan=1, colspan=2)
ax2 = plt.subplot2grid((5, 2), (1, 0), rowspan=3, colspan=2)
ax3 = plt.subplot2grid((5, 2), (4, 0), rowspan=1, colspan=1)
ax4 = plt.subplot2grid((5, 2), (4, 1), rowspan=1, colspan=1)

x, y = random_plots()
ax1.plot(x, y)

x, y = random_plots()
ax2.plot(x, y)

x, y = random_plots()
ax3.plot(x, y)

x, y = random_plots()
ax4.plot(x, y)

plt.tight_layout()
plt.show()
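
The `subplot2grid((5, 2), ...)` calls above divide the figure into 5 rows and 2 columns and let each axes span several cells. The same layout can be written with grid slices, which some readers find easier to read. This is a sketch of an equivalent layout using `Figure.add_gridspec`; the axes titles are added only to label the cells.

```python
import matplotlib.pyplot as plt

fig = plt.figure()
grid = fig.add_gridspec(5, 2)  # 5 rows x 2 columns, as in the example above

ax1 = fig.add_subplot(grid[0, :])    # row 0, both columns
ax2 = fig.add_subplot(grid[1:4, :])  # rows 1 to 3, both columns
ax3 = fig.add_subplot(grid[4, 0])    # row 4, left column
ax4 = fig.add_subplot(grid[4, 1])    # row 4, right column

for ax, name in [(ax1, "ax1"), (ax2, "ax2"), (ax3, "ax3"), (ax4, "ax4")]:
    ax.set_title(name)

plt.tight_layout()
plt.show()
```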

 

(8) 3D scatter plot

import matplotlib.pyplot as plt
import numpy as np
from mpl_toolkits.mplot3d import axes3d
fig = plt.figure()
ax = fig.add_subplot(111, projection = '3d')
x1 = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
y1 = np.random.randint(10, size=10)
z1 = np.random.randint(10, size=10)
x2 = [-1, -2, -3, -4, -5, -6, -7, -8, -9, -10]
y2 = np.random.randint(-10, 0, size=10)
z2 = np.random.randint(10, size=10)
ax.scatter(x1, y1, z1, c='b', marker='o', label='blue')
ax.scatter(x2, y2, z2, c='g', marker='D', label='green')
ax.set_xlabel('x axis')
ax.set_ylabel('y axis')
ax.set_zlabel('z axis')
plt.title("3D Scatter Plot Example")
plt.legend()
plt.tight_layout()
plt.show()

 

(9) 3D bar chart

import matplotlib.pyplot as plt
import numpy as np
fig = plt.figure()
ax = fig.add_subplot(111, projection = '3d')
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
y = np.random.randint(10, size=10)
z = np.zeros(10)
dx = np.ones(10)
dy = np.ones(10)
dz = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
ax.bar3d(x, y, z, dx, dy, dz, color='g')
ax.set_xlabel('x axis')
ax.set_ylabel('y axis')
ax.set_zlabel('z axis')
plt.title("3D Bar Chart Example")
plt.tight_layout()
plt.show()

 

(10) 3D wireframe plot

import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import axes3d  # provides get_test_data()
fig = plt.figure()
ax = fig.add_subplot(111, projection = '3d')
x, y, z = axes3d.get_test_data()
ax.plot_wireframe(x, y, z, rstride = 2, cstride = 2)
plt.title("Wireframe Plot Example")
plt.tight_layout()
plt.show()

 

(11) Additional package - Altair

import altair as alt
from vega_datasets import data
cars = data.cars()
alt.Chart(cars).mark_point().encode(x='Horsepower',y='Miles_per_Gallon', color='Origin',).interactive()

 

Official site: https://altair-viz.github.io/

(12) Additional package - Plotly

from plotly.offline import iplot

import plotly.graph_objs as go
data = [go.Contour( z=[[10, 10.625, 12.5, 15.625, 20], [5.625, 6.25, 8.125, 11.25, 15.625],
[2.5, 3.125, 5., 8.125, 12.5],[0.625, 1.25, 3.125, 6.25, 10.625], [0, 0.625, 2.5, 5.625, 10]] )]
iplot(data)

 

Official site: https://plotly.com/python/

(13) Additional package - Bokeh

import numpy as np
from bokeh.plotting import figure, show
from bokeh.io import output_notebook
# Call once to configure Bokeh to display plots inline in the notebook.
output_notebook()
N = 4000
x = np.random.random(size=N) * 100
y = np.random.random(size=N) * 100
radii = np.random.random(size=N) * 1.5
colors = ["#%02x%02x%02x" % (r, g, 150) for r, g in zip(np.floor(50+2*x).astype(int), np.floor(30+2*y).astype(int))]
p = figure()
p.circle(x, y, radius=radii, fill_color=colors, fill_alpha=0.6, line_color=None)
show(p)

 

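Official site: https://bokeh.org/

2. Interactive package - working with tables:

Colab includes an extension that displays pandas DataFrames as interactive tables, so the data can be filtered, sorted, and searched on the fly.

(1) Displaying a pandas data table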

from google.colab import data_table
from vega_datasets import data
data_table.enable_dataframe_formatter()
data.airports()

 

PS: Calling "data_table.disable_dataframe_formatter()" restores the native pandas display.
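
The two formatter calls can be tried back to back to compare both renderings. The sketch below is an illustration under assumptions: the three-row DataFrame is made up, and the `try`/`except` guard is added only so the cell does not fail when it is run outside Colab, where `google.colab` is not installed.

```python
import pandas as pd

try:
    from google.colab import data_table  # only available inside Colab
except ImportError:
    data_table = None  # running elsewhere; skip the Colab-only calls

# Hypothetical three-row table, made up for this sketch.
df = pd.DataFrame({"city": ["Taipei", "Tokyo", "Seoul"], "visits": [3, 5, 2]})

if data_table is not None:
    from IPython.display import display

    data_table.enable_dataframe_formatter()
    display(df)  # shown as an interactive table
    data_table.disable_dataframe_formatter()
    display(df)  # shown with the native pandas rendering again
else:
    print(df)
```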

(2) Displaying a pandas data table (custom settings)

from google.colab import data_table
from vega_datasets import data
data_table.DataTable(data.airports(), include_index=False, num_rows_per_page=5)
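
The interactive table performs filtering, sorting, and searching through its user interface. For comparison, the same three operations can be written in plain pandas. This is a minimal sketch: the four-row table and its column names are assumptions made for illustration, not the actual `data.airports()` dataset.

```python
import pandas as pd

# Hypothetical sample rows, made up for this sketch.
airports = pd.DataFrame({
    "iata": ["SFO", "LAX", "SEA", "PAO"],
    "name": ["San Francisco International", "Los Angeles International",
             "Seattle-Tacoma International", "Palo Alto Airport"],
    "state": ["CA", "CA", "WA", "CA"],
})

# Sort: order the rows by airport name.
sorted_by_name = airports.sort_values("name")

# Filter: keep only the rows whose state is CA.
in_california = airports[airports["state"] == "CA"]

# Search: keep the rows whose name contains a keyword, ignoring case.
international = airports[airports["name"].str.contains("international", case=False)]

print(sorted_by_name)
print(in_california)
print(international)
```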

 

 

 

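3. Interactive package - working with charts:

Colab also provides Altair, a powerful interactive charting package that presents data in polished charts.

(1) Scatter plot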

# load an example dataset
from vega_datasets import data
cars = data.cars()

# plot the dataset, referencing dataframe column names
import altair as alt
alt.Chart(cars).mark_point().encode( x='Horsepower', y='Miles_per_Gallon', color='Origin' ).interactive()

 

(2) Interactive scatter plot

# load an example dataset

from vega_datasets import data
cars = data.cars()

import altair as alt
points = alt.Chart(cars).mark_point().encode( x='Year:T', y='Miles_per_Gallon', color='Origin' ).properties( width=800 )
lines = alt.Chart(cars).mark_line().encode( x='Year:T', y='mean(Miles_per_Gallon)', color='Origin').properties( width=800).interactive(bind_y=False)
points + lines

 

PS: Scroll the mouse wheel over this chart to zoom the range of values shown.

(3) Bar chart

# load an example dataset
from vega_datasets import data
cars = data.cars()

# plot the dataset, referencing dataframe column names
import altair as alt
alt.Chart(cars).mark_bar().encode( x='mean(Miles_per_Gallon)', y='Origin', color='Origin' )

 

(4) Histogram

# load an example dataset
from vega_datasets import data
cars = data.cars()

# plot the dataset, referencing dataframe column names
import altair as alt
alt.Chart(cars).mark_bar().encode( x=alt.X('Miles_per_Gallon', bin=True), y='count()',)

 

(5) Stacked histogram

# load an example dataset
from vega_datasets import data
cars = data.cars()

# plot the dataset, referencing dataframe column names
import altair as alt
alt.Chart(cars).mark_bar().encode( x=alt.X('Miles_per_Gallon', bin=True), y='count()', color='Origin' )

 

(6) Chart composition - combining charts of the same type

# load an example dataset
from vega_datasets import data
cars = data.cars()

# plot the dataset, referencing dataframe column names
import altair as alt
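interval = alt.selection_interval()
# Assumes Altair 5 or later, where a selection is attached with add_params();
# on Altair 4, use .add_selection(interval) instead.
base = alt.Chart(cars).mark_point().encode( y='Miles_per_Gallon', color=alt.condition(interval, 'Origin', alt.value('lightgray'))).add_params(interval)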
base.encode(x='Acceleration') | base.encode(x='Horsepower')

 

(7) Chart composition - combining charts of different types

# load an example dataset
from vega_datasets import data
cars = data.cars()

# plot the dataset, referencing dataframe column names
import altair as alt
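interval = alt.selection_interval()
# Assumes Altair 5 or later, where a selection is attached with add_params();
# on Altair 4, use .add_selection(interval) instead.
points = alt.Chart(cars).mark_point().encode(x='Horsepower',y='Miles_per_Gallon',color=alt.condition(interval, 'Origin', alt.value('lightgray'))).add_params(interval)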
histogram = alt.Chart(cars).mark_bar().encode(x='count()',y='Origin',color='Origin').transform_filter(interval)
points & histogram

 

 

 

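III. Conclusion

The main purpose of this article is to show how practical Colab is, in the hope that readers can treat this series as a reference handbook and put it to use quickly. This article walked through a range of matplotlib plotting methods. The part most worth attention is the interactive charting package Altair, which makes charts interactive, for example by scrolling the mouse wheel to zoom in on a chart, and produces polished results. Later posts will explain how to combine Colab with Google Drive, Google Sheets, and GitHub for a wider range of applications. Stay tuned!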

 


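If you have any technical questions about Colab, feel free to leave a comment below this post.
More Colab technical articles are on the way. Stay tuned for the 【ATU Book - i.MX8 Series - Colab】 posts!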

 
