Path: ~//proc/thread-self/root/opt/alt/python35/lib64/python3.5/site-packages/sklearn/cluster/__pycache__/
File Content: spectral.cpython-35.opt-1.pyc
# spectral.py, decompiled from spectral.cpython-35.opt-1.pyc.
# Docstrings and string constants are recovered from the bytecode; import
# paths and function bodies are reconstructed to match the scikit-learn
# release this file was compiled from, and may differ in minor details.
"""Algorithms for spectral clustering"""
import numpy as np

from ..base import BaseEstimator, ClusterMixin
from ..utils import check_random_state, as_float_array
from ..utils.validation import check_array
from ..utils.extmath import norm
from ..metrics.pairwise import pairwise_kernels
from ..neighbors import kneighbors_graph
from ..manifold import spectral_embedding
from .k_means_ import k_means


def discretize(vectors, copy=True, max_svd_restarts=30, n_iter_max=30,
               random_state=None):
    """Search for a partition matrix (clustering) which is closest to the
    eigenvector embedding.

    Parameters
    ----------
    vectors : array-like, shape: (n_samples, n_clusters)
        The embedding space of the samples.

    copy : boolean, optional, default: True
        Whether to copy vectors, or perform in-place normalization.

    max_svd_restarts : int, optional, default: 30
        Maximum number of attempts to restart SVD if convergence fails.

    n_iter_max : int, optional, default: 30
        Maximum number of iterations to attempt in rotation and partition
        matrix search if machine precision convergence is not reached.

    random_state : int seed, RandomState instance, or None (default)
        A pseudo random number generator used for the initialization of
        the rotation matrix.

    Returns
    -------
    labels : array of integers, shape: n_samples
        The labels of the clusters.

    References
    ----------
    - Multiclass spectral clustering, 2003
      Stella X. Yu, Jianbo Shi
      http://www1.icsi.berkeley.edu/~stellayu/publication/doc/2003kwayICCV.pdf

    Notes
    -----
    The eigenvector embedding is used to iteratively search for the
    closest discrete partition. First, the eigenvector embedding is
    normalized to the space of partition matrices. An optimal discrete
    partition matrix closest to this normalized embedding multiplied by an
    initial rotation is calculated. Fixing this discrete partition matrix,
    an optimal rotation matrix is calculated. These two calculations are
    performed until convergence. The discrete partition matrix is returned
    as the clustering solution. Used in spectral clustering, this method
    tends to be faster and more robust to random initialization than
    k-means.
    """
    from scipy.sparse import csc_matrix
    from scipy.linalg import LinAlgError

    random_state = check_random_state(random_state)

    vectors = as_float_array(vectors, copy=copy)

    eps = np.finfo(float).eps
    n_samples, n_components = vectors.shape

    # Normalize the eigenvectors to an equal length of a vector of ones.
    # Reorient the eigenvectors to point in the negative direction with
    # respect to the first element.
    norm_ones = np.sqrt(n_samples)
    for i in range(vectors.shape[1]):
        vectors[:, i] = (vectors[:, i] / norm(vectors[:, i])) * norm_ones
        if vectors[0, i] != 0:
            vectors[:, i] = -1 * vectors[:, i] * np.sign(vectors[0, i])

    # Normalize the rows of the eigenvectors. Samples should lie on the
    # unit hypersphere centered at the origin. This transforms the samples
    # in the embedding space to the space of partition matrices.
    vectors = vectors / np.sqrt((vectors ** 2).sum(axis=1))[:, np.newaxis]

    svd_restarts = 0
    has_converged = False

    # If there is an exception, randomize and rerun the SVD, at most
    # max_svd_restarts times.
    while (svd_restarts < max_svd_restarts) and not has_converged:

        # Initialize the first column of the rotation matrix with a row
        # of the eigenvectors.
        rotation = np.zeros((n_components, n_components))
        rotation[:, 0] = vectors[random_state.randint(n_samples), :].T

        # To initialize the rest of the rotation matrix, find the rows of
        # the eigenvectors that are as orthogonal to each other as
        # possible.
        c = np.zeros(n_samples)
        for j in range(1, n_components):
            # Accumulate c to ensure the row is as orthogonal as possible
            # to the previous picks as well as the current one.
            c += np.abs(np.dot(vectors, rotation[:, j - 1]))
            rotation[:, j] = vectors[c.argmin(), :].T

        last_objective_value = 0.0
        n_iter = 0

        while not has_converged:
            n_iter += 1

            t_discrete = np.dot(vectors, rotation)

            # Closest discrete partition matrix to the rotated embedding.
            labels = t_discrete.argmax(axis=1)
            vectors_discrete = csc_matrix(
                (np.ones(len(labels)), (np.arange(0, n_samples), labels)),
                shape=(n_samples, n_components))

            t_svd = vectors_discrete.T * vectors

            try:
                U, S, Vh = np.linalg.svd(t_svd)
            except LinAlgError:
                svd_restarts += 1
                print("SVD did not converge, randomizing and trying again")
                break

            ncut_value = 2.0 * (n_samples - S.sum())
            if ((abs(ncut_value - last_objective_value) < eps) or
                    (n_iter > n_iter_max)):
                has_converged = True
            else:
                # Otherwise calculate the optimal rotation for this
                # partition and continue.
                last_objective_value = ncut_value
                rotation = np.dot(Vh.T, U.T)

    if not has_converged:
        raise LinAlgError('SVD did not converge')
    return labels


def spectral_clustering(affinity, n_clusters=8, n_components=None,
                        eigen_solver=None, random_state=None, n_init=10,
                        eigen_tol=0.0, assign_labels='kmeans'):
    """Apply clustering to a projection to the normalized Laplacian.

    In practice spectral clustering is very useful when the structure of
    the individual clusters is highly non-convex or, more generally, when
    a measure of the center and spread of the cluster is not a suitable
    description of the complete cluster, for instance when clusters are
    nested circles on the 2D plane.

    If affinity is the adjacency matrix of a graph, this method can be
    used to find normalized graph cuts.

    Read more in the :ref:`User Guide <spectral_clustering>`.

    Parameters
    ----------
    affinity : array-like or sparse matrix, shape: (n_samples, n_samples)
        The affinity matrix describing the relationship of the samples to
        embed. **Must be symmetric**.

        Possible examples:
            - adjacency matrix of a graph,
            - heat kernel of the pairwise distance matrix of the samples,
            - symmetric k-nearest neighbours connectivity matrix of the
              samples.

    n_clusters : integer, optional
        Number of clusters to extract.

    n_components : integer, optional, default is n_clusters
        Number of eigenvectors to use for the spectral embedding.

    eigen_solver : {None, 'arpack', 'lobpcg', or 'amg'}
        The eigenvalue decomposition strategy to use. AMG requires pyamg
        to be installed. It can be faster on very large, sparse problems,
        but may also lead to instabilities.

    random_state : int seed, RandomState instance, or None (default)
        A pseudo random number generator used for the initialization of
        the lobpcg eigenvector decomposition when eigen_solver == 'amg'
        and by the k-means initialization.

    n_init : int, optional, default: 10
        Number of times the k-means algorithm will be run with different
        centroid seeds. The final result will be the best output of
        n_init consecutive runs in terms of inertia.

    eigen_tol : float, optional, default: 0.0
        Stopping criterion for the eigendecomposition of the Laplacian
        matrix when using the arpack eigen_solver.

    assign_labels : {'kmeans', 'discretize'}, default: 'kmeans'
        The strategy to use to assign labels in the embedding space.
        There are two ways to assign labels after the Laplacian embedding:
        k-means can be applied and is a popular choice, but it can be
        sensitive to initialization. Discretization is another approach
        which is less sensitive to random initialization. See the
        'Multiclass spectral clustering' paper referenced below for more
        details on the discretization approach.

    Returns
    -------
    labels : array of integers, shape: n_samples
        The labels of the clusters.

    References
    ----------
    - Normalized cuts and image segmentation, 2000
      Jianbo Shi, Jitendra Malik
      http://citeseer.ist.psu.edu/viewdoc/summary?doi=10.1.1.160.2324

    - A Tutorial on Spectral Clustering, 2007
      Ulrike von Luxburg
      http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.165.9323

    - Multiclass spectral clustering, 2003
      Stella X. Yu, Jianbo Shi
      http://www1.icsi.berkeley.edu/~stellayu/publication/doc/2003kwayICCV.pdf

    Notes
    -----
    The graph should contain only one connected component; otherwise the
    results make little sense.

    This algorithm solves the normalized cut for k=2: it is a normalized
    spectral clustering.
    """
    if assign_labels not in ('kmeans', 'discretize'):
        raise ValueError("The 'assign_labels' parameter should be "
                         "'kmeans' or 'discretize', but '%s' was given"
                         % assign_labels)

    random_state = check_random_state(random_state)
    n_components = n_clusters if n_components is None else n_components
    maps = spectral_embedding(affinity, n_components=n_components,
                              eigen_solver=eigen_solver,
                              random_state=random_state,
                              eigen_tol=eigen_tol, drop_first=False)

    if assign_labels == 'kmeans':
        _, labels, _ = k_means(maps, n_clusters, random_state=random_state,
                               n_init=n_init)
    else:
        labels = discretize(maps, random_state=random_state)

    return labels


class SpectralClustering(BaseEstimator, ClusterMixin):
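The pipeline these docstrings describe (normalized-Laplacian embedding followed by the 'discretize' label assignment) can be sketched in plain NumPy without the sklearn internals. This is a minimal illustration, not the library's implementation: the 6-node affinity matrix, the seed, and all variable names are hypothetical, and the embedding uses a dense eigensolver rather than arpack/lobpcg/amg.

```python
import numpy as np

# Hypothetical affinity matrix: two 3-node cliques joined by a single
# weak edge, standing in for a precomputed symmetric affinity.
A = np.ones((6, 6))
A[:3, 3:] = 0.0
A[3:, :3] = 0.0
A[2, 3] = A[3, 2] = 0.01

n_samples = A.shape[0]
n_clusters = 2

# Spectral embedding: eigenvectors of the normalized Laplacian
# L_sym = I - D^{-1/2} A D^{-1/2} for the n_clusters smallest eigenvalues.
d_inv_sqrt = 1.0 / np.sqrt(A.sum(axis=1))
L = np.eye(n_samples) - d_inv_sqrt[:, None] * A * d_inv_sqrt[None, :]
eigvals, eigvecs = np.linalg.eigh(L)  # eigh sorts eigenvalues ascending
vectors = eigvecs[:, :n_clusters]

# Discretization: normalize rows onto the unit hypersphere, seed the
# rotation with near-orthogonal embedding rows, then alternate between
# the closest discrete partition and the optimal rotation (via SVD).
vectors = vectors / np.linalg.norm(vectors, axis=1, keepdims=True)
rng = np.random.default_rng(0)
rotation = np.zeros((n_clusters, n_clusters))
rotation[:, 0] = vectors[rng.integers(n_samples)]
c = np.abs(vectors @ rotation[:, 0])
rotation[:, 1] = vectors[c.argmin()]  # row least aligned with the first pick

last_objective = 0.0
for _ in range(30):
    # Closest discrete partition to the rotated embedding.
    labels = (vectors @ rotation).argmax(axis=1)
    X = np.zeros((n_samples, n_clusters))
    X[np.arange(n_samples), labels] = 1.0
    # Optimal orthogonal rotation for this fixed partition (Procrustes).
    U, S, Vh = np.linalg.svd(X.T @ vectors)
    objective = 2.0 * (n_samples - S.sum())
    if abs(objective - last_objective) < np.finfo(float).eps:
        break
    last_objective = objective
    rotation = Vh.T @ U.T

# labels now separates the two cliques: nodes 0-2 vs nodes 3-5.
```

The alternation converges here in a couple of iterations because the embedding is nearly a rotated partition matrix already; on noisier affinities the SVD restart machinery in `discretize` above handles the occasional non-converging decomposition.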