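2.5 Line Charts
2.5.1 Basic Example
Line charts were already mentioned in the discussion of scatter plots: in Plotly, a line chart is drawn with the Scatter function. Figure 2-10 shows a simple line chart; see the file 2.5_LineChart_1.py. This example uses Pandas to generate a time series for the x-axis labels and plots the daily price change of Shanghai Pudong Development Bank (浦发银行, stock code 600000) from March 1, 2017 to April 28, 2017. The data comes from the Wind database.
Figure 2-10 Basic line chart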
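# 2.5-1 Basic example
import pandas as pd
import plotly as py
import plotly.graph_objs as go

# Basic Line
pyplt = py.offline.plot

# Daily price change of 600000 浦发银行, 2017-03-01 to 2017-04-28; data source: Wind
profit_rate = [-0.001, -0.013, -0.004, 0.002, 0.003, -0.001, -0.009, 0.0,
               0.007, -0.005, 0.0, 0.001, -0.006, -0.006, -0.009, -0.013, 0.005, 0.007,
               0.004, -0.006, -0.009, -0.004, 0.015, 0.007, 0.001, 0.003, -0.009,
               -0.005, 0.001, -0.008, -0.016, 0.002, -0.013, -0.009, -0.014, 0.009,
               -0.003, 0.002, -0.001, 0.011, 0.004]

# Scatter pairs x and y by position, so the x axis needs one date per value.
# Assumption: the 41 values are the trading days of the period, that is, the
# weekdays minus the exchange closure on April 3 and 4 (Qingming holiday).
date = pd.date_range(start = '2017-03-01', end = '2017-04-28', freq = 'B')
date = date[~date.isin(pd.to_datetime(['2017-04-03', '2017-04-04']))]

trace = [go.Scatter(
    x = date,
    y = profit_rate
)]

layout = dict(
    title = '浦发银行 daily price change, 2017-03-01 to 2017-04-28',
    xaxis = dict(title = 'Date'),
    yaxis = dict(title = 'profit_rate')
)

fig = dict(data = trace, layout = layout)
pyplt(fig, filename = 'tmp/basic-line.html')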
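2.5.2 Data Gaps and Connecting Them
Real-world data sets are rarely perfect and may contain missing values. In Plotly, the connectgaps attribute of the Scatter function controls whether such gaps are shown or bridged. Figure 2-11 is adapted from an official example; it draws several lines in one chart, styles them, and controls whether data gaps are kept or connected. See the file 2.5_LineChart_2.py.
Figure 2-11 Showing and connecting missing data in a line chart
The code for this example is as follows.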
# 2.5-2 Application example
# Average High and Low Temperatures in New York
import plotly as py
import plotly.graph_objs as go

pyplt = py.offline.plot

# x-axis categories
month = ['January', 'February', 'March', 'April', 'May', 'June', 'July',
         'August', 'September', 'October', 'November', 'December']

# Six data series; None marks a missing value
high_2000 = [32.5, 37.6, 49.9, 53.0, None, 75.4, 76.5, 76.6, 70.7, 60.6, 45.1, 29.3]
low_2000 = [13.8, 22.3, 32.5, 37.2, None, 56.1, 57.7, 58.3, 51.2, 42.8, 31.6, 15.9]
high_2007 = [36.5, 26.6, 43.6, 52.3, None, 81.4, 80.5, 82.2, 76.0, 67.3, 46.1, 35.0]
low_2007 = [23.6, 14.0, 27.0, 36.8, None, 57.7, 58.9, 61.2, 53.3, 48.5, 31.0, 23.6]
high_2014 = [28.8, 28.5, 37.0, 56.8, None, 79.7, 78.5, 77.8, 74.1, 62.6, 45.3, 39.9]
low_2014 = [12.7, 14.3, 18.6, 35.5, None, 58.0, 60.0, 58.6, 51.7, 45.2, 32.2, 29.1]

# Create and style traces
trace0 = go.Scatter(
    x = month,
    y = high_2014,
    name = 'High 2014',
    line = dict(
        color = 'rgb(205, 12, 24)',
        width = 4),
    connectgaps = True    # bridge the gap
)
trace1 = go.Scatter(
    x = month,
    y = low_2014,
    name = 'Low 2014',
    line = dict(
        color = 'rgb(22, 96, 167)',
        width = 4),
    connectgaps = False   # leave the gap visible
)
# dash: dashed line; dot: dotted line; dashdot: alternating dashes and dots
trace2 = go.Scatter(
    x = month,
    y = high_2007,
    name = 'High 2007',
    line = dict(
        color = 'rgb(205, 12, 24)',
        width = 4,
        dash = 'dash'),
    connectgaps = False
)
trace3 = go.Scatter(
    x = month,
    y = low_2007,
    name = 'Low 2007',
    line = dict(
        color = 'rgb(22, 96, 167)',
        width = 4,
        dash = 'dash'),
    connectgaps = False
)
trace4 = go.Scatter(
    x = month,
    y = high_2000,
    name = 'High 2000',
    line = dict(
        color = 'rgb(205, 12, 24)',
        width = 4,
        dash = 'dot'),
    connectgaps = False
)
trace5 = go.Scatter(
    x = month,
    y = low_2000,
    name = 'Low 2000',
    line = dict(
        color = 'rgb(22, 96, 167)',
        width = 4,
        dash = 'dot'),
    connectgaps = False
)
data = [trace0, trace1, trace2, trace3, trace4, trace5]

# Edit the layout
layout = dict(
    title = 'Average High and Low Temperatures in New York',
    xaxis = dict(title = 'Month'),
    yaxis = dict(title = 'Temperature (degrees F)'),
)

fig = dict(data = data, layout = layout)
pyplt(fig, filename = 'tmp/styled-line.html')
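In the data section, the missing values are set to None. In the Scatter function, setting connectgaps to False leaves the gap unconnected, so the missing data is visible; setting connectgaps to True joins the data points on either side of the missing value. In Figure 2-11, the "High 2014" line is connected across the gap, while all other lines show it.
The line attribute of the Scatter function controls the style of the line: color sets the color, width sets the width, and dash sets the line type. The value dash gives a line made of short dashes, dot gives a line made of dots, and dashdot gives a line made of alternating dots and dashes.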
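The effect of connectgaps and dash can be tried in isolation on a smaller figure. The following is a minimal sketch, not one of the book's listings. It reuses the first six months of the 2014 series from the example above and describes the traces as plain dictionaries, relying on the fact that Plotly accepts a figure given as dict(data=..., layout=...), as the listings in this section do. The helpers make_trace and gap_indices are hypothetical names introduced for illustration; they are not Plotly functions, and the output path in the final comment is likewise an assumption.

```python
# Minimal sketch: one bridged gap and one visible gap, with two dash styles.
# make_trace and gap_indices are illustrative helpers, not part of Plotly.

months = ['January', 'February', 'March', 'April', 'May', 'June']
high_2014 = [28.8, 28.5, 37.0, 56.8, None, 79.7]  # None marks the missing value
low_2014 = [12.7, 14.3, 18.6, 35.5, None, 58.0]


def gap_indices(values):
    """Return the positions of missing (None) entries in a series."""
    return [i for i, v in enumerate(values) if v is None]


def make_trace(name, y, connectgaps, dash):
    """Describe one scatter trace as a plain dict."""
    return dict(
        type='scatter',
        mode='lines',
        name=name,
        x=months,
        y=y,
        connectgaps=connectgaps,         # True bridges the gap, False shows it
        line=dict(width=4, dash=dash),   # 'dash', 'dot' or 'dashdot'
    )


data = [
    make_trace('High 2014 (connected)', high_2014, True, 'dash'),
    make_trace('Low 2014 (gap shown)', low_2014, False, 'dot'),
]
fig = dict(data=data, layout=dict(title='connectgaps and dash'))

print(gap_indices(high_2014))  # prints [4], the position of May

# To render the figure (requires plotly; the output path is an assumption):
# import plotly
# plotly.offline.plot(fig, filename='tmp/connectgaps-sketch.html')
```

Keeping the missing entry as None, rather than deleting it, is what lets the same data be drawn either way: the gap stays in the series and connectgaps only decides how it is rendered.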
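2.5.3 Data Interpolation
The shape value inside the line attribute of the Scatter function controls how the line is interpolated between data points. Put simply, interpolation means finding, from a set of scattered data points, a curve that satisfies certain conditions and passes through all of them. Plotly provides six interpolation methods: 'linear', 'spline', 'hv', 'vh', 'hvh', and 'vhv'. For example, shape='spline' joins the data points with a smooth spline curve, while the four h/v values draw horizontal and vertical steps. Figure 2-12 is an official example that compares the six methods; see the file 2.5_LineChart_3.py.
Figure 2-12 Comparison of interpolation methods
The code for this example is as follows.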
# 2.5-3 Application example
import plotly as py
import plotly.graph_objs as go

pyplt = py.offline.plot

trace1 = go.Scatter(
    x=[1, 2, 3, 4, 5],
    y=[1, 3, 2, 3, 1],
    mode='lines+markers',
    name="'linear'",
    hoverinfo='name',
    line=dict(
        shape='linear'
    )
)
trace2 = go.Scatter(
    x=[1, 2, 3, 4, 5],
    y=[6, 8, 7, 8, 6],
    mode='lines+markers',
    name="'spline'",
    text=["tweak line smoothness<br>with 'smoothing' in line object"],
    hoverinfo='text+name',
    line=dict(
        shape='spline'
    )
)
trace3 = go.Scatter(
    x=[1, 2, 3, 4, 5],
    y=[11, 13, 12, 13, 11],
    mode='lines+markers',
    name="'vhv'",
    hoverinfo='name',
    line=dict(
        shape='vhv'
    )
)
trace4 = go.Scatter(
    x=[1, 2, 3, 4, 5],
    y=[16, 18, 17, 18, 16],
    mode='lines+markers',
    name="'hvh'",
    hoverinfo='name',
    line=dict(
        shape='hvh'
    )
)
trace5 = go.Scatter(
    x=[1, 2, 3, 4, 5],
    y=[21, 23, 22, 23, 21],
    mode='lines+markers',
    name="'vh'",
    hoverinfo='name',
    line=dict(
        shape='vh'
    )
)
trace6 = go.Scatter(
    x=[1, 2, 3, 4, 5],
    y=[26, 28, 27, 28, 26],
    mode='lines+markers',
    name="'hv'",
    hoverinfo='name',
    line=dict(
        shape='hv'
    )
)
data = [trace1, trace2, trace3, trace4, trace5, trace6]

layout = dict(
    legend=dict(
        y=0.5,
        traceorder='reversed',
        font=dict(
            size=16
        )
    )
)

fig = dict(data=data, layout=layout)
pyplt(fig, filename='tmp/line-shapes.html')
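The four step shapes differ only in the order of the horizontal and vertical moves between two neighbouring points. The sketch below spells out that geometry in plain Python. It is an illustration, not Plotly's implementation: step_vertices is a hypothetical helper, and it assumes that 'hvh' and 'vhv' place their middle segment at the midpoint between the two points. The 'spline' shape is left out because it is a curve rather than a polyline.

```python
# Sketch of the step geometry behind line.shape; not Plotly's own code.

def step_vertices(x, y, shape):
    """Return the polyline vertices that join consecutive (x, y) points."""
    if shape not in ('linear', 'hv', 'vh', 'hvh', 'vhv'):
        raise ValueError('unsupported shape: {}'.format(shape))
    points = list(zip(x, y))
    path = [points[0]]
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if shape == 'hv':        # horizontal first, then vertical
            path.append((x1, y0))
        elif shape == 'vh':      # vertical first, then horizontal
            path.append((x0, y1))
        elif shape == 'hvh':     # vertical move at the midpoint in x (assumed)
            xm = (x0 + x1) / 2
            path.extend([(xm, y0), (xm, y1)])
        elif shape == 'vhv':     # horizontal move at the midpoint in y (assumed)
            ym = (y0 + y1) / 2
            path.extend([(x0, ym), (x1, ym)])
        path.append((x1, y1))    # 'linear' adds nothing in between
    return path


for shape in ('linear', 'hv', 'vh', 'hvh', 'vhv'):
    print(shape, step_vertices([1, 2, 3], [1, 3, 2], shape))
```

Reading the name left to right gives the order of the moves: 'hv' holds the previous value until the next x position and then jumps, whereas 'vh' jumps immediately and then holds the new value.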
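2.5.4 Filled Line Charts
A filled line chart is a variant of the line chart, produced by showing some lines, hiding others, and filling the area they enclose. Figure 2-13 shows the opening price together with the daily high and low for the stocks of 恒宝股份 (002104), 湘潭电化 (002125), and 大港股份 (002077) over a period of time. Each visible line traces a stock's opening price; the upper edge of the shaded band around it marks the day's high, and the lower edge marks the day's low. See the file 2.5_LineChart_4.py.
Figure 2-13 Filled line chart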
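To draw this chart, split it into two parts: the three visible lines (the opening prices) and the three filled bands. The following fragment draws one band.
x = x + x_rev,
y = y1_upper + y1_lower,
fill = 'tozerox',
fillcolor = 'rgba(0,0,205,0.2)',
line = dict(color = 'rgba(255,255,255,0)'),
First, x + x_rev is the sequence from 1 to 10 followed by the sequence from 10 back to 1. Correspondingly, y1_upper + y1_lower runs through the daily highs from day 1 to day 10 and then through the daily lows from day 10 back to day 1; note that y1_lower has already been reversed in the data setup. The trace therefore follows the upper boundary from left to right and the lower boundary from right to left, and the fill attribute shades the region between the two. Finally, the color attribute of line is set to a fully transparent value so that the outline itself is hidden (older Plotly code often wrote this as go.Line(color = 'transparent')). The result is shown in Figure 2-13.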
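The construction of the band outline can be checked without drawing anything. The following is a minimal sketch using the 恒宝股份 highs and lows from this example. band_outline is a hypothetical helper, not a Plotly function, and unlike the listing it takes the lower series in its natural day 1 to day 10 order and reverses it internally.

```python
# Sketch: build the closed outline of one filled band from highs and lows.
# band_outline is an illustrative helper, not part of Plotly.

x = list(range(1, 11))
y1_upper = [9.05, 9.03, 9.08, 8.76, 8.63, 9.04, 9.09, 9.16, 8.9, 9.45]
y1_lower = [8.86, 8.85, 8.64, 8.36, 8.33, 8.43, 8.93, 8.84, 8.53, 8.52]


def band_outline(x, upper, lower):
    """Walk the upper boundary left to right, then the lower one right to left."""
    if not (len(x) == len(upper) == len(lower)):
        raise ValueError('x, upper and lower must have the same length')
    outline_x = list(x) + list(x)[::-1]
    outline_y = list(upper) + list(lower)[::-1]
    return outline_x, outline_y


outline_x, outline_y = band_outline(x, y1_upper, y1_lower)

# The band as a plain-dict scatter trace, mirroring the fragment above.
# 'toself' is another valid fill value that closes the outline on itself.
band_trace = dict(
    type='scatter',
    x=outline_x,
    y=outline_y,
    fill='tozerox',
    fillcolor='rgba(0,0,205,0.2)',
    line=dict(color='rgba(255,255,255,0)'),  # fully transparent outline
    showlegend=False,
)

print(list(zip(outline_x, outline_y))[9:11])  # the turn from day-10 high to day-10 low
```

Because the outline returns to x = 1 at the end, the highs and lows of the same day sit at mirrored positions in the two halves of the sequence, which is why the lower series has to be reversed.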
The code for this example is as follows.
# 2.5-4 Application example
import plotly as py
import plotly.graph_objs as go

pyplt = py.offline.plot

x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
x_rev = x[::-1]

# Line 1: 002104 恒宝股份, 2017-05-18 to 2017-06-02
y1 = [8.86, 8.85, 8.69, 8.4, 8.62, 9, 8.99, 8.85, 8.59, 9.31]
y1_upper = [9.05, 9.03, 9.08, 8.76, 8.63, 9.04, 9.09, 9.16, 8.9, 9.45]
y1_lower = [8.86, 8.85, 8.64, 8.36, 8.33, 8.43, 8.93, 8.84, 8.53, 8.52]
y1_lower = y1_lower[::-1]  # reversed

# Line 2: 002125 湘潭电化, 2017-05-18 to 2017-06-02
y2 = [10.39, 10.35, 9.85, 9.73, 9.77, 9.8, 9.75, 9.65, 9.16, 9.34]
y2_upper = [10.58, 10.52, 10.34, 10.14, 9.87, 9.87, 9.94, 9.6, 9.42, 9.5]
y2_lower = [10.15, 10.21, 9.72, 9.68, 9.24, 9.48, 9.62, 9.12, 9.12, 9.34]
y2_lower = y2_lower[::-1]

# Line 3: 002077 大港股份, 2017-05-18 to 2017-06-02
y3 = [11.88, 13.07, 12.75, 12.02, 12.1, 12.61, 12.42, 12.42, 11.18, 10.72]
y3_upper = [11.98, 13.07, 13.4, 12.91, 12.45, 13.1, 12.61, 12.65, 12.45, 11.16]
y3_lower = [11.6, 11.75, 12.75, 12.02, 11.8, 11.92, 12.17, 12.29, 11.18, 10.35]
y3_lower = y3_lower[::-1]

# Filled bands. A fully transparent line color hides the band outline
# (legacy code wrote this as go.Line(color = 'transparent')).
trace1 = go.Scatter(
    x = x + x_rev,
    y = y1_upper + y1_lower,
    fill = 'tozerox',
    fillcolor = 'rgba(0,0,205,0.2)',
    line = dict(color = 'rgba(255,255,255,0)'),
    showlegend = False,
    name = '恒宝股份',
)
trace2 = go.Scatter(
    x = x + x_rev,
    y = y2_upper + y2_lower,
    fill = 'tozerox',
    fillcolor = 'rgba(30,144,255,0.2)',
    line = dict(color = 'rgba(255,255,255,0)'),
    name = '湘潭电化',
    showlegend = False,
)
trace3 = go.Scatter(
    x = x + x_rev,
    y = y3_upper + y3_lower,
    fill = 'tozerox',
    fillcolor = 'rgba(112,128,144,0.2)',
    line = dict(color = 'rgba(255,255,255,0)'),
    showlegend = False,
    name = '大港股份',
)

# Visible lines (opening prices)
trace4 = go.Scatter(
    x = x,
    y = y1,
    line = dict(color = 'rgb(0,0,205)'),
    mode = 'lines',
    name = '恒宝股份',
)
trace5 = go.Scatter(
    x = x,
    y = y2,
    line = dict(color = 'rgb(30,144,255)'),
    mode = 'lines',
    name = '湘潭电化',
)
trace6 = go.Scatter(
    x = x,
    y = y3,
    line = dict(color = 'rgb(112,128,144)'),
    mode = 'lines',
    name = '大港股份',
)
data = [trace1, trace2, trace3, trace4, trace5, trace6]

layout = go.Layout(
    paper_bgcolor = 'rgb(255,255,255)',
    plot_bgcolor = 'rgb(229,229,229)',
    xaxis = dict(
        gridcolor = 'rgb(255,255,255)',
        range = [1, 10],
        showgrid = True,
        showline = False,
        showticklabels = True,
        tickcolor = 'rgb(127,127,127)',
        ticks = 'outside',
        zeroline = False
    ),
    yaxis = dict(
        gridcolor = 'rgb(255,255,255)',
        showgrid = True,
        showline = False,
        showticklabels = True,
        tickcolor = 'rgb(127,127,127)',
        ticks = 'outside',
        zeroline = False
    ),
)

fig = go.Figure(data = data, layout = layout)
pyplt(fig, filename = 'tmp/shaded_lines.html')
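2.5.5 Application Example
Figure 2-14 shows the output of a line chart of news-source statistics; the code is in the file 2.5_LineChart_5.py.
Figure 2-14 Line chart of news-source statistics
The code for this example is as follows.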
# 2.5-5 Application example
import plotly as py
import plotly.graph_objs as go

pyplt = py.offline.plot

title = 'Main Source for News'
labels = ['Television', 'Newspaper', 'Internet', 'Radio']
colors = ['rgba(67,67,67,1)', 'rgba(115,115,115,1)', 'rgba(49,130,189, 1)',
          'rgba(189,189,189,1)']
mode_size = [8, 8, 12, 8]
line_size = [2, 2, 4, 2]

x_data = [
    [2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010, 2011, 2013],
    [2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010, 2011, 2013],
    [2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010, 2011, 2013],
    [2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010, 2011, 2013],
]
y_data = [
    [74, 82, 80, 74, 73, 72, 74, 70, 70, 66, 66, 69],
    [45, 42, 50, 46, 36, 36, 34, 35, 32, 31, 31, 28],
    [13, 14, 20, 24, 20, 24, 24, 40, 35, 41, 43, 50],
    [18, 21, 18, 21, 16, 14, 13, 18, 17, 16, 19, 23],
]

traces = []
for i in range(0, 4):
    # The line itself
    traces.append(go.Scatter(
        x = x_data[i],
        y = y_data[i],
        mode = 'lines',
        line = dict(color = colors[i], width = line_size[i]),
        connectgaps = True,
    ))
    # Markers on the first and last data points only
    traces.append(go.Scatter(
        x = [x_data[i][0], x_data[i][11]],
        y = [y_data[i][0], y_data[i][11]],
        mode = 'markers',
        marker = dict(color = colors[i], size = mode_size[i])
    ))

layout = go.Layout(
    xaxis = dict(
        showline = True,
        showgrid = False,
        showticklabels = True,              # True shows the tick labels
        linecolor = 'rgb(204, 204, 204)',   # color of the x-axis line
        linewidth = 2,
        tickmode = 'linear',                # evenly spaced ticks (legacy code: autotick = False)
        dtick = 1,                          # one tick per year
        ticks = 'outside',                  # draw tick marks inside or outside the plot
        tickcolor = 'rgb(204, 204, 204)',   # color of the tick marks
        tickwidth = 2,                      # width of the tick marks
        ticklen = 10,                       # length of the tick marks
        tickfont = dict(                    # font family, size, and color of the tick labels
            family = 'Arial',
            size = 12,
            color = 'rgb(82, 82, 82)',
        ),
    ),
    yaxis = dict(
        showgrid = False,
        zeroline = False,
        showline = False,
        showticklabels = False,
    ),
    autosize = False,
    margin = dict(
        autoexpand = False,
        l = 100,
        r = 20,
        t = 110,
    ),
    showlegend = False,
)

annotations = []

# Adding labels
for y_trace, label, color in zip(y_data, labels, colors):
    # labeling the left side of the plot
    annotations.append(dict(xref = 'paper', x = 0.05, y = y_trace[0],
                            xanchor = 'right', yanchor = 'middle',
                            text = label + ' {}%'.format(y_trace[0]),
                            font = dict(family = 'Arial', size = 16, color = color),
                            showarrow = False))
    # labeling the right side of the plot
    annotations.append(dict(xref = 'paper', x = 0.95, y = y_trace[11],
                            xanchor = 'left', yanchor = 'middle',
                            text = '{}%'.format(y_trace[11]),
                            font = dict(family = 'Arial', size = 16, color = color),
                            showarrow = False))

# Title
annotations.append(dict(xref = 'paper', yref = 'paper', x = 0.0, y = 1.05,
                        xanchor = 'left', yanchor = 'bottom',
                        text = title,
                        font = dict(family = 'Arial', size = 30, color = 'rgb(37,37,37)'),
                        showarrow = False))

# Source
annotations.append(dict(xref = 'paper', yref = 'paper', x = 0.5, y = -0.1,
                        xanchor = 'center', yanchor = 'top',
                        text = 'Source: PewResearch Center & ' + 'Storytelling with data',
                        font = dict(family = 'Arial', size = 12, color = 'rgb(150,150,150)'),
                        showarrow = False))

layout['annotations'] = annotations

fig = go.Figure(data = traces, layout = layout)
pyplt(fig, filename = 'tmp/news-source.html')
2.5.6 Parameter Reference
Line charts are drawn the same way as scatter plots, with the Scatter function, so they take the same parameters. See Section 2.3.4 for details.