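Seaborn Distribution Data Visualization: Scatter-Distribution Plots

2022-09-04 Source: 易榕旅网

Scatter-distribution plots

These plots combine a scatter plot with histograms of each variable's distribution.

jointplot()

Draws a plot of two variables with both bivariate and univariate graphs. It is built on top of JointGrid().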

sns.jointplot(x, y, data=None, kind='scatter', stat_func=None, color=None,
              height=6, ratio=5, space=0.2, dropna=True, xlim=None, ylim=None,
              joint_kws=None, marginal_kws=None, annot_kws=None, **kwargs)

Docstring:

Draw a plot of two variables with bivariate and univariate graphs.

This function provides a convenient interface to the :class:`JointGrid`
class, with several canned plot kinds. This is intended to be a fairly
lightweight wrapper; if you need more flexibility, you should use
:class:`JointGrid` directly.

Parameters
----------
x, y : strings or vectors
    Data or names of variables in ``data``.
data : DataFrame, optional
    DataFrame when ``x`` and ``y`` are variable names.
kind : { "scatter" | "reg" | "resid" | "kde" | "hex" }, optional
    Kind of plot to draw.
stat_func : callable or None, optional
    *Deprecated*
color : matplotlib color, optional
    Color used for the plot elements.
height : numeric, optional
    Size of the figure (it will be square).
ratio : numeric, optional
    Ratio of joint axes height to marginal axes height.
space : numeric, optional
    Space between the joint and marginal axes
dropna : bool, optional
    If True, remove observations that are missing from ``x`` and ``y``.
{x, y}lim : two-tuples, optional
    Axis limits to set before plotting.
{joint, marginal, annot}_kws : dicts, optional
    Additional keyword arguments for the plot components.
kwargs : key, value pairings
    Additional keyword arguments are passed to the function used to
    draw the plot on the joint Axes, superseding items in the
    ``joint_kws`` dictionary.

Returns
-------
grid : :class:`JointGrid`
    :class:`JointGrid` object with the plot on it.

See Also
--------
JointGrid : The Grid class used for drawing this plot. Use it directly if
            you need more flexibility.

# Combined scatter-distribution plot with jointplot()

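# Imports used by the examples (the alias "sci" for scipy.stats is assumed from the calls below)
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import scipy.stats as sci
import seaborn as sns

# Create the DataFrame
rs = np.random.RandomState(3)
df = pd.DataFrame(rs.randn(200, 2), columns=['A', 'B'])

# Draw the combined scatter-distribution plot with jointplot()
# Note: stat_func is marked deprecated in the docstring above, so this call
# matches the seaborn release the signature was taken from.
sns.jointplot(x=df['A'], y=df['B'],    # data for the x and y axes
              data=df,                 # data source
              color='k',
              s=50, edgecolor='w', linewidth=1,  # marker size, edge color and edge width (scatter only)
              kind='scatter',          # default kind is "scatter"; others are "reg", "resid", "kde", "hex"
              space=0.2,               # gap between the scatter plot and the marginal plots
              height=8,                # figure size (always square)
              ratio=5,                 # ratio of joint-axes height to marginal-axes height
              stat_func=sci.pearsonr,  # Pearson correlation coefficient
              marginal_kws=dict(bins=15, rug=True))  # options for the marginal plots

# kind='reg' adds a linear regression line
sns.jointplot(x=df['A'], y=df['B'], data=df, color='k',
              kind='reg',
              height=8, ratio=5,
              stat_func=sci.pearsonr,
              marginal_kws=dict(bins=15, rug=True))

# kind='resid' draws a residual plot
sns.jointplot(x=df['A'], y=df['B'], data=df, color='k',
              kind='resid',
              height=8, ratio=5,
              marginal_kws=dict(bins=15, rug=True))

# kind='kde' draws a density plot
sns.jointplot(x=df['A'], y=df['B'], data=df, color='k',
              kind='kde',
              height=8, ratio=5)

# kind='hex' draws a hexbin (honeycomb) plot
sns.jointplot(x=df['A'], y=df['B'], data=df, color='k',
              kind='hex',
              height=8, ratio=5)

# Density plot that will have the observations overlaid
g = sns.jointplot(x=df['A'], y=df['B'], data=df, color='k',
                  kind='kde',
                  height=8, ratio=5,
                  shade_lowest=False)

# Overlay a scatter plot (c --> color, s --> size)
g.plot_joint(plt.scatter, c='w', s=10, linewidth=1, marker='+')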

JointGrid()

Creates a figure grid for drawing a bivariate plot with univariate marginal plots. It serves the same purpose as jointplot() but is more flexible.

sns.JointGrid(x, y, data=None, height=6, ratio=5, space=0.2, dropna=True,
              xlim=None, ylim=None, size=None)

Docstring: Grid for drawing a bivariate plot with marginal univariate plots.

Init docstring:

Set up the grid of subplots.

Parameters
----------
x, y : strings or vectors
    Data or names of variables in ``data``.
data : DataFrame, optional
    DataFrame when ``x`` and ``y`` are variable names.
height : numeric
    Size of each side of the figure in inches (it will be square).
ratio : numeric
    Ratio of joint axes size to marginal axes height.
space : numeric, optional
    Space between the joint and marginal axes
dropna : bool, optional
    If True, remove observations that are missing from `x` and `y`.
{x, y}lim : two-tuples, optional
    Axis limits to set before plotting.

See Also
--------
jointplot : High-level interface for drawing bivariate plots with
            several different default plot kinds.
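To make the height, ratio and space parameters concrete, here is a rough matplotlib-only sketch of a joint layout. It is not seaborn's implementation: the helper name joint_layout is hypothetical, and mapping space onto GridSpec's hspace/wspace is an assumption made for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np


def joint_layout(height=6, ratio=5, space=0.2):
    """Hypothetical helper: square figure with one joint and two marginal axes."""
    fig = plt.figure(figsize=(height, height))
    # `ratio` rows/columns for the joint axes plus one row/column for the marginals
    gs = fig.add_gridspec(ratio + 1, ratio + 1, hspace=space, wspace=space)
    ax_joint = fig.add_subplot(gs[1:, :-1])
    # The marginal axes share the relevant axis with the joint axes
    ax_marg_x = fig.add_subplot(gs[0, :-1], sharex=ax_joint)
    ax_marg_y = fig.add_subplot(gs[1:, -1], sharey=ax_joint)
    return fig, ax_joint, ax_marg_x, ax_marg_y


# Synthetic data (illustrative only)
rs = np.random.RandomState(0)
x, y = rs.randn(2, 200)

fig, ax_joint, ax_marg_x, ax_marg_y = joint_layout(height=6, ratio=5, space=0.2)
ax_joint.scatter(x, y, s=10)
ax_marg_x.hist(x, bins=15)
ax_marg_y.hist(y, bins=15, orientation="horizontal")
```

A larger ratio gives the joint axes more of the figure, and a larger space widens the gap between the joint axes and the marginals.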

# Set the style
sns.set_style('white')

# Load the data
tip_datas = sns.load_dataset('tips', data_home='seaborn-data')

# Create the plotting grid, which has three parts: one main plot area and two marginal plot areas
g = sns.JointGrid(x='total_bill', y='tip', data=tip_datas)

# Main plot area: scatter plot
g.plot_joint(plt.scatter, color='m', edgecolor='w', alpha=.3)

# Marginal plot areas: x and y axes
g.ax_marg_x.hist(tip_datas['total_bill'], color='b', alpha=.3)
g.ax_marg_y.hist(tip_datas['tip'], color='r', alpha=.3, orientation='horizontal')

# Correlation coefficient label
from scipy import stats
g.annotate(stats.pearsonr)

# Draw grid lines
plt.grid(linestyle='--')

g = sns.JointGrid(x='total_bill', y='tip', data=tip_datas)
g = g.plot_joint(plt.scatter, color='g', s=40, edgecolor='white')
plt.grid(linestyle='--')

# Use a single function so both marginal plots share one style
g.plot_marginals(sns.distplot, kde=True, color='g')

g = sns.JointGrid(x='total_bill', y='tip', data=tip_datas)

# Main plot area: density plot
g = g.plot_joint(sns.kdeplot, cmap='Reds_r')
plt.grid(linestyle='--')

# Use a single function so both marginal plots share one style
g.plot_marginals(sns.distplot, kde=True, color='g')

pairplot()

Plots pairwise relationships in a dataset, for example as a scatter-plot matrix. It is built on top of PairGrid().

sns.pairplot(data, hue=None, hue_order=None, palette=None, vars=None,
             x_vars=None, y_vars=None, kind='scatter', diag_kind='auto',
             markers=None, height=2.5, aspect=1, dropna=True, plot_kws=None,
             diag_kws=None, grid_kws=None, size=None)

Docstring:

Plot pairwise relationships in a dataset.

By default, this function will create a grid of Axes such that each
variable in ``data`` will be shared in the y-axis across a single row and
in the x-axis across a single column. The diagonal Axes are treated
differently, drawing a plot to show the univariate distribution of the data
for the variable in that column.

It is also possible to show a subset of variables or plot different
variables on the rows and columns.

This is a high-level interface for :class:`PairGrid` that is intended to
make it easy to draw a few common styles. You should use :class:`PairGrid`
directly if you need more flexibility.

Parameters
----------
data : DataFrame
    Tidy (long-form) dataframe where each column is a variable and
    each row is an observation.
hue : string (variable name), optional
    Variable in ``data`` to map plot aspects to different colors.
hue_order : list of strings
    Order for the levels of the hue variable in the palette
palette : dict or seaborn color palette
    Set of colors for mapping the ``hue`` variable. If a dict, keys
    should be values in the ``hue`` variable.
vars : list of variable names, optional
    Variables within ``data`` to use, otherwise use every column with
    a numeric datatype.
{x, y}_vars : lists of variable names, optional
    Variables within ``data`` to use separately for the rows and
    columns of the figure; i.e. to make a non-square plot.
kind : {'scatter', 'reg'}, optional
    Kind of plot for the non-identity relationships.
diag_kind : {'auto', 'hist', 'kde'}, optional
    Kind of plot for the diagonal subplots. The default depends on whether
    ``"hue"`` is used or not.
markers : single matplotlib marker code or list, optional
    Either the marker to use for all datapoints or a list of markers with
    a length the same as the number of levels in the hue variable so that
    differently colored points will also have different scatterplot
    markers.
height : scalar, optional
    Height (in inches) of each facet.
aspect : scalar, optional
    Aspect * height gives the width (in inches) of each facet.
dropna : boolean, optional
    Drop missing values from the data before plotting.
{plot, diag, grid}_kws : dicts, optional
    Dictionaries of keyword arguments.

Returns
-------
grid : PairGrid
    Returns the underlying ``PairGrid`` instance for further tweaking.

See Also
--------
PairGrid : Subplot grid for more flexible plotting of pairwise
           relationships.

# Load the iris data
i_datas = sns.load_dataset('iris', data_home='seaborn-data')
i_datas

# Scatter-plot matrix
sns.pairplot(i_datas,
             kind='scatter',           # plot kind (scatter plot: scatter, regression plot: reg)
             diag_kind='hist',         # plot kind on the diagonal (histogram: hist, density plot: kde)
             hue='species',            # group by this column
             palette='husl',           # color palette
             markers=['o', 's', 'D'],  # marker styles
             height=2)                 # height of each facet

# Regression-plot matrix
sns.pairplot(i_datas,
             kind='reg',               # plot kind (scatter plot: scatter, regression plot: reg)
             diag_kind='kde',          # plot kind on the diagonal (histogram: hist, density plot: kde)
             hue='species',            # group by this column
             palette='husl',           # color palette
             markers=['o', 's', 'D'],  # marker styles
             height=2)                 # height of each facet

# Select a subset of variables with vars
g = sns.pairplot(i_datas, vars=['sepal_width', 'sepal_length'],
                 kind='reg', diag_kind='kde', hue='species', palette='husl')

# Combined parameter settings
sns.pairplot(i_datas, diag_kind='kde', markers='+', hue='species',
             plot_kws=dict(s=50, edgecolor='b', linewidth=1),  # options for the scatter plots
             diag_kws=dict(shade=True))                        # options for the diagonal plots
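The docstring also mentions x_vars and y_vars for non-square grids, which the examples above do not exercise. The sketch below illustrates them on a synthetic iris-like DataFrame; the column names mirror the iris dataset, but the values are randomly generated for illustration and are not the real measurements.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import numpy as np
import pandas as pd
import seaborn as sns

# Synthetic iris-like data (made-up values; column names mirror the iris dataset)
rs = np.random.RandomState(0)
n = 90
iris_like = pd.DataFrame({
    "sepal_length": rs.normal(5.8, 0.8, n),
    "sepal_width": rs.normal(3.0, 0.4, n),
    "petal_length": rs.normal(3.7, 1.7, n),
    "species": np.repeat(["setosa", "versicolor", "virginica"], n // 3),
})

# One row (y_vars) by two columns (x_vars): a non-square grid
g = sns.pairplot(iris_like,
                 x_vars=["sepal_length", "sepal_width"],
                 y_vars=["petal_length"],
                 hue="species", kind="scatter", height=2.5)

print(g.axes.shape)  # rows follow y_vars, columns follow x_vars
```

This form is handy when one response variable should be compared against several predictors without drawing the full matrix.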

PairGrid()

Plots pairwise relationships in a dataset, for example as a scatter-plot matrix. It is more flexible than pairplot().

sns.PairGrid(data, hue=None, hue_order=None, palette=None, hue_kws=None,
             vars=None, x_vars=None, y_vars=None, diag_sharey=True,
             height=2.5, aspect=1, despine=True, dropna=True, size=None)

Docstring:

Subplot grid for plotting pairwise relationships in a dataset.

This class maps each variable in a dataset onto a column and row in a
grid of multiple axes. Different axes-level plotting functions can be
used to draw bivariate plots in the upper and lower triangles, and the
marginal distribution of each variable can be shown on the diagonal.

It can also represent an additional level of conditionalization with the
``hue`` parameter, which plots different subsets of data in different
colors. This uses color to resolve elements on a third dimension, but
only draws subsets on top of each other and will not tailor the ``hue``
parameter for the specific visualization the way that axes-level functions
that accept ``hue`` will.

See the :ref:`tutorial` for more information.

Init docstring:

Initialize the plot figure and PairGrid object.

Parameters
----------
data : DataFrame
    Tidy (long-form) dataframe where each column is a variable and
    each row is an observation.
hue : string (variable name), optional
    Variable in ``data`` to map plot aspects to different colors.
hue_order : list of strings
    Order for the levels of the hue variable in the palette
palette : dict or seaborn color palette
    Set of colors for mapping the ``hue`` variable. If a dict, keys
    should be values in the ``hue`` variable.
hue_kws : dictionary of param -> list of values mapping
    Other keyword arguments to insert into the plotting call to let
    other plot attributes vary across levels of the hue variable (e.g.
    the markers in a scatterplot).
vars : list of variable names, optional
    Variables within ``data`` to use, otherwise use every column with
    a numeric datatype.
{x, y}_vars : lists of variable names, optional
    Variables within ``data`` to use separately for the rows and
    columns of the figure; i.e. to make a non-square plot.
height : scalar, optional
    Height (in inches) of each facet.
aspect : scalar, optional
    Aspect * height gives the width (in inches) of each facet.
despine : boolean, optional
    Remove the top and right spines from the plots.
dropna : boolean, optional
    Drop missing values from the data before plotting.

See Also
--------
pairplot : Easily drawing common uses of :class:`PairGrid`.
FacetGrid : Subplot grid for plotting conditional relationships.

# Create a plotting grid (subplots) for the four variables listed in vars
g = sns.PairGrid(i_datas, hue='species', palette='hls',
                 vars=['sepal_length', 'sepal_width', 'petal_length', 'petal_width'])

# Plots on the diagonal
g.map_diag(plt.hist,
           histtype='step',  # options: 'bar', 'barstacked', 'step', 'stepfilled'
           linewidth=1)

# Plots off the diagonal
g.map_offdiag(plt.scatter, s=40, linewidth=1)

# Add a legend
g.add_legend()

g = sns.PairGrid(i_datas)

# Plots on the main diagonal
g.map_diag(sns.kdeplot)

# Plots in the upper triangle
g.map_upper(plt.scatter)

# Plots in the lower triangle
g.map_lower(sns.kdeplot, cmap='Blues_d')
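The hue_kws parameter described in the docstring is not used in the examples above. The sketch below shows one way it might be applied, giving each hue level its own scatter marker. It again relies on a synthetic iris-like DataFrame whose values are made up, and the marker list is an arbitrary choice.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Synthetic iris-like data (made-up values; column names mirror the iris dataset)
rs = np.random.RandomState(1)
n = 90
iris_like = pd.DataFrame({
    "sepal_length": rs.normal(5.8, 0.8, n),
    "sepal_width": rs.normal(3.0, 0.4, n),
    "petal_length": rs.normal(3.7, 1.7, n),
    "species": np.repeat(["setosa", "versicolor", "virginica"], n // 3),
})

# hue_kws maps a plotting parameter to a list with one value per hue level
g = sns.PairGrid(iris_like, hue="species", palette="hls",
                 hue_kws={"marker": ["o", "s", "D"]})

# Diagonal: step histograms; off-diagonal: scatter plots with per-species markers
g.map_diag(plt.hist, histtype="step", linewidth=1)
g.map_offdiag(plt.scatter, s=30, linewidth=1)
g.add_legend()
```

Varying the marker as well as the color keeps the groups distinguishable when the figure is printed in grayscale.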
