Penalized likelihood and multi-objective spatial scans for the detection and inference of irregular clusters. Directory of Open Access Journals (Sweden). Abstract. Background: Irregularly shaped spatial clusters are difficult to delineate. A cluster found by an algorithm often spreads through large portions of the map, impacting its geographical meaning. Penalized likelihood methods for Kulldorff's spatial scan statistic have been used to control the excessive freedom of cluster shape. Penalty functions based on cluster geometry and non-connectivity have been proposed recently.
Another approach involves the use of a multi-objective algorithm to maximize two objectives: the spatial scan statistic and the geometric penalty function. A disconnection node is defined as a region within a cluster such that its removal disconnects the cluster. By applying this function, the most geographically meaningful clusters are sifted from the immense set of possible irregularly shaped candidate cluster solutions.
To evaluate the statistical significance of solutions for multi-objective scans, a statistical approach based on the concept of attainment function is used. In this paper we compared different penalized likelihoods employing the geometric and non-connectivity regularity functions and the novel disconnection nodes cohesion function.
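The disconnection-node idea described above is what graph theory calls an articulation point: a vertex whose removal disconnects the induced subgraph. As a rough sketch (not the authors' implementation; the `adjacency` dict of region neighbors is a hypothetical input format), such nodes can be found by testing connectivity after removing each region in turn:

```python
def disconnection_nodes(adjacency, cluster):
    """Return regions whose removal disconnects the cluster (articulation points)."""
    cluster = set(cluster)

    def connected(nodes):
        # depth-first search over the subgraph induced by `nodes`
        nodes = set(nodes)
        if not nodes:
            return True
        seen, stack = set(), [next(iter(nodes))]
        while stack:
            r = stack.pop()
            if r in seen:
                continue
            seen.add(r)
            stack.extend(n for n in adjacency[r] if n in nodes and n not in seen)
        return seen == nodes

    return sorted(r for r in cluster if not connected(cluster - {r}))
```

For a chain of regions a-b-c, only the middle region disconnects the cluster when removed. A linear-time DFS (Tarjan/Hopcroft) would avoid the quadratic cost of this naive version, but the naive form matches the definition most directly.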
We also build multi-objective scans using those three functions and compare them with the previous penalized likelihood scans. An application is presented using comprehensive state-wide data for Chagas disease in puerperal women in Minas Gerais state, Brazil. Conclusions: We show that, compared to the single-objective algorithms, multi-objective scans perform better in terms of power, sensitivity and positive predictive value.
The multi-objective non-connectivity scan is faster and better suited for the …

Chambers and Dorfman constructed bootstrap confidence intervals in model-based estimation for finite population totals, assuming that auxiliary values are available throughout a target population and that the auxiliary values are independent. They also assumed that the cluster sizes are known throughout the target population. We now extend this to two-stage sampling, in which the cluster sizes are known only for the sampled clusters, and we therefore predict the unobserved part of the population total.
Jan and Elinor have done similar work, but unlike them, we use a general model, in which the auxiliary values are not necessarily independent.
We demonstrate that the asymptotic properties of our proposed estimator and its coverage rates are better than those constructed under the model-assisted local polynomial regression model.

Neuro-fuzzy system modeling based on automatic fuzzy clustering. A neuro-fuzzy system model based on automatic fuzzy clustering is proposed, along with a hybrid model identification algorithm that decides the model structure and model parameters. The algorithm includes three parts: (1) automatic fuzzy C-means (AFCM), which generates fuzzy rules automatically and thereby fixes the size of the neuro-fuzzy network, greatly reducing the complexity of system design at the price of some fitting capability; (2) recursive least squares estimation (RLSE).
It is used to update the parameters of the Takagi-Sugeno model, which describes the behavior of the system; (3) a gradient descent algorithm is proposed for the fuzzy values, following the back-propagation algorithm of neural networks. Finally, modeling the dynamical equation of a two-link manipulator with the proposed approach is presented to validate the feasibility of the method.
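The fuzzy C-means step referenced above alternates two updates: memberships computed from distances to the cluster centers, and centers recomputed as membership-weighted means. A minimal one-dimensional sketch (not the AFCM variant of the paper; initialization and data are illustrative assumptions):

```python
def fuzzy_c_means(xs, c=2, m=2.0, iters=50):
    """1-D fuzzy C-means: returns centers and the membership matrix u[i][k]."""
    s = sorted(xs)
    # spread the initial centers over the data range (a simple deterministic choice)
    centers = [s[round(k * (len(s) - 1) / (c - 1))] for k in range(c)]
    u = [[0.0] * c for _ in xs]
    for _ in range(iters):
        # membership update: u_ik = 1 / sum_j (d_ik / d_ij)^(2/(m-1))
        for i, x in enumerate(xs):
            d = [abs(x - ck) + 1e-12 for ck in centers]
            for k in range(c):
                u[i][k] = 1.0 / sum((d[k] / dj) ** (2 / (m - 1)) for dj in d)
        # center update: membership-weighted mean of the data
        for k in range(c):
            w = [u[i][k] ** m for i in range(len(xs))]
            centers[k] = sum(wi * x for wi, x in zip(w, xs)) / sum(w)
    return centers, u
```

With two well-separated groups of points, the centers converge to the group means and each point's membership concentrates on its nearer center; the fuzzifier `m` controls how soft that split is.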
Collaborative filtering recommendation model based on a fuzzy clustering algorithm. As one of the most widely used algorithms in recommender systems, collaborative filtering faces two serious problems: data sparsity and poor recommendation quality in big-data environments. In traditional clustering analysis, each object is strictly assigned to one of several classes and the boundaries of this division are very clear. However, most objects in real life have no strict definition of the form and attributes of their class.
Concerning the problems above, this paper proposes to improve the traditional collaborative filtering model through a hybrid of a latent semantic algorithm and a fuzzy clustering algorithm, in cooperation with the collaborative filtering algorithm.
In this paper, the fuzzy clustering algorithm is applied to the item attribute information, so that each item belongs to different item categories with different membership degrees. This increases the density of the data, effectively reduces its sparsity, and addresses the low accuracy that results from inaccurate similarity calculation. Finally, the paper carries out an empirical analysis on the MovieLens dataset and compares the approach with the traditional user-based collaborative filtering algorithm.
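One way fuzzy item memberships can densify sparse ratings, sketched under assumed data structures (the `memberships` dict mapping items to per-category membership degrees is hypothetical, not the paper's exact formulation): a user's affinity for each category is a membership-weighted mean of their ratings, and an unrated item's prediction blends those affinities through its own memberships.

```python
def predict(user_ratings, memberships, item):
    """Predict a rating from fuzzy item-category memberships.

    user_ratings: {item: rating} for one user
    memberships:  {item: [m_cat0, m_cat1, ...]} with degrees in [0, 1]
    """
    ncat = len(next(iter(memberships.values())))
    # the user's affinity for each category: membership-weighted mean rating
    aff = []
    for k in range(ncat):
        num = sum(r * memberships[i][k] for i, r in user_ratings.items())
        den = sum(memberships[i][k] for i in user_ratings) or 1.0
        aff.append(num / den)
    # the target item's memberships blend the category affinities
    w = memberships[item]
    return sum(wk * ak for wk, ak in zip(w, aff)) / sum(w)
```

An item belonging half to a liked category and half to a disliked one lands between the two affinities, which is exactly the soft-boundary behavior hard clustering cannot express.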
The proposed algorithm greatly improves recommendation accuracy.

A first-packet processing subdomain cluster model based on SDN. This work addresses the performance bottlenecks of controller-cluster packet processing and the problem of controller downtime. An SDN (Software-Defined Network) controller is proposed to allocate a priority to each device in the network; each domain contains several network devices and a controller, the controller manages the network equipment within its domain, and the switches deliver data according to the load of the controllers.
The experimental results show that the model can effectively mitigate the risk of a single point of failure of the controller and can relieve the performance bottleneck of first-packet processing.
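The load-based dispatch and failover behavior described above can be sketched in a few lines (a toy model, not the paper's protocol; class and method names are illustrative): switches send first packets to the least-loaded live controller, and a failed controller simply drops out of the pool.

```python
class ControllerCluster:
    """Toy SDN controller pool: least-loaded dispatch with failover."""

    def __init__(self, names):
        # current outstanding first-packet load per controller
        self.load = {n: 0 for n in names}

    def assign(self, flow):
        # a switch sends the first packet of `flow` to the least-loaded controller
        ctrl = min(self.load, key=self.load.get)
        self.load[ctrl] += 1
        return ctrl

    def fail(self, name):
        # on controller failure, remove it so no new flows are routed to it
        self.load.pop(name, None)
```

The single-point-of-failure risk disappears because `assign` only ever consults the surviving pool; the bottleneck is relieved because load spreads across controllers instead of concentrating on one.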
Section 4 demonstrates our methodological enhancements in the context of several simulated and classical data sets, showcasing the use of particle learning methods for online anomaly detection, label generation, drift detection, and semi-supervised classification, none of which would be achievable through a standard MCMC approach.
Section 5 concludes with a discussion of future directions for research.

Simultaneous clustering of rows and columns, usually designated bi-clustering, co-clustering or block clustering, is an important technique in two-way data analysis.
A new standard and efficient approach has recently been proposed, based on the latent block model (Govaert and Nadif), which addresses the block clustering problem on both the individual and variable sets. This article presents our R package blockcluster for co-clustering of binary, contingency and continuous data based on these models.
In this document, we give a brief review of model-based block clustering methods and show how the R package blockcluster can be used for co-clustering.

A user credit assessment model based on clustering ensemble for broadband network new media service supervision. This paper proposes a user credit assessment model based on a clustering ensemble, aiming to solve the problem of users illegally spreading pirated and pornographic media content on user self-service broadband network new media platforms.
The idea is to assess new-media user credit by establishing an indicator system based on user credit behaviors; illegal users can then be identified from the assessment results, curbing the bad videos and audios transmitted on the network. The proposed model integrates the advantages of swarm intelligence clustering, which is suitable for user credit behavior analysis, and K-means clustering, which can eliminate the scattered users left in the result of swarm intelligence clustering, so as to classify all users' credit automatically.
Verification experiments based on a standard credit application dataset from the UCI machine learning repository, together with a statistical comparison against a single swarm intelligence clustering model, indicate that the clustering ensemble model has a stronger ability to distinguish creditworthiness, especially in predicting the user clusters with the best and worst credit, which helps operators take incentive or punitive measures accurately.
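The ensemble step, in which K-means cleans up the scattered points of an initial (e.g. swarm-produced) partition, can be sketched on one-dimensional data (an illustrative toy, not the paper's algorithm): compute centroids from the initial labels, then let every point migrate to its nearest centroid.

```python
def kmeans_refine(points, labels, iters=20):
    """Refine an initial partition with k-means; 1-D sketch.

    points: list of floats; labels: initial cluster label per point.
    """
    ks = sorted(set(labels))
    for _ in range(iters):
        # centroids of the current partition
        cent = {k: sum(p for p, l in zip(points, labels) if l == k)
                   / max(1, sum(1 for l in labels if l == k))
                for k in ks}
        # reassignment: scattered points migrate to their nearest centroid
        labels = [min(ks, key=lambda k: abs(p - cent[k])) for p in points]
    return labels
```

A point mislabeled into a distant group by the first-stage clustering gets pulled back in the very first reassignment pass, which is the "eliminating scattered users" effect the abstract describes.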
Besides, compared with the experimental results of a logistic regression based model under the same conditions, this clustering ensemble model is robust and has better prediction accuracy.

A model-based cluster analysis was applied to five indicators of maternal emotion regulation (ER): maternal self-report, observed negative affect in a parent-child interaction, baseline respiratory sinus arrhythmia (RSA), and RSA suppression across two laboratory tasks.
Model-based cluster analyses revealed four maternal ER profiles, including a group of mothers with average ER functioning, characterized by socioeconomic advantage and more positive parenting behavior.
A dysregulated cluster demonstrated the greatest challenges with parenting and dyadic interactions. Two clusters of intermediate dysregulation were also identified. Implications for assessment and applications to parenting interventions are discussed.
Aiming at the irregularity of nonlinear signals and the difficulty of predicting them, a deep learning prediction model based on extreme-point symmetric mode decomposition (ESMD) and clustering analysis is proposed. First, the signal is decomposed into components by ESMD.
Secondly, fuzzy c-means is used to cluster the decomposed components, and a deep belief network (DBN) is then used to predict each cluster.
Finally, the predicted components are reconstructed.

A model-based clustering method to detect infectious disease transmission outbreaks from sequence variation. Clustering infections by genetic similarity is a popular technique for identifying potential outbreaks of infectious disease, in part because sequences are now routinely collected for the clinical management of many infections.
A diverse range of nonparametric clustering methods have been developed for this purpose. These methods are generally intuitive, rapid to compute, and readily scale to large data sets. However, we have found that nonparametric clustering methods can be biased towards identifying clusters of diagnosis (where individuals are sampled sooner post-infection) rather than the clusters of rapid transmission that are meant to be the foci of public health efforts.
We develop a fundamentally new approach to genetic clustering based on fitting a Markov-modulated Poisson process (MMPP), which represents the evolution of transmission rates along the tree relating different infections. We evaluated this model-based method alongside five nonparametric clustering methods using both simulated and actual HIV sequence data sets.
Furthermore, the mean internal branch lengths that approximate transmission rates were significantly shorter in clusters extracted using the MMPP, but not in those extracted by the other methods. We determined that the computing time of the MMPP method scales linearly with tree size, requiring about 30 seconds for a tree of 1,000 tips and about 20 minutes for 50,000 tips on a single computer.
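The "mean internal branch length" statistic above is simply the average length of branches whose both endpoints fall inside a cluster. A sketch under an assumed edge-list tree representation (real analyses would use a phylogenetics library rather than this toy format):

```python
def mean_internal_branch(edges, cluster):
    """Mean length of branches with both endpoints inside `cluster`.

    edges: list of (parent, child, branch_length) tuples; cluster: set of node names.
    """
    inside = [length for parent, child, length in edges
              if parent in cluster and child in cluster]
    return sum(inside) / len(inside) if inside else float('nan')
```

Under the MMPP interpretation, short internal branches correspond to high transmission rates, so clusters with small values of this statistic are the ones of epidemiological interest.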
This new approach to genetic clustering has significant implications for the application of pathogen sequence analysis to public health, where …

The nature of data in astrophysics has changed, as in other scientific fields, over the past decades owing to increased measurement capabilities. As a consequence, data are nowadays frequently high-dimensional and available in bulk or as streams.
Model-based techniques for clustering are popular tools, renowned for their probabilistic foundations and their flexibility. However, classical model-based techniques behave disappointingly in high-dimensional spaces, mainly because of their dramatic over-parametrization. This work presents a comprehensive review of recent approaches that address this, including regularization-based techniques, parsimonious modeling, subspace classification methods and classification methods based on variable selection.
The use of these model-based methods is also illustrated on real-world classification problems in Astrophysics using R packages.
This energy consumption issue becomes more challenging when the energy load is not properly distributed in the sensing area.
A hierarchical clustering architecture is the best choice for this kind of issue. In our proposed model, we introduce a simple strategy for selecting the optimal number of cluster heads to overcome the problem of uneven energy distribution in the network. The attraction of our model is that the base station (BS) controls the number of cluster heads while each cluster head controls the members of its cluster, in such a restricted manner that a uniform and even load is ensured in each cluster.
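The two roles described above can be sketched as follows (a toy model under assumed inputs, not the paper's protocol: 1-D node positions, residual-energy head election, and a per-cluster member cap standing in for the "restricted manner" of admission):

```python
def elect_cluster_heads(energy, n_heads):
    """BS role: pick the n nodes with the most residual energy as cluster heads."""
    return sorted(energy, key=energy.get, reverse=True)[:n_heads]

def assign_members(nodes, heads, pos, cap):
    """Head role: each head admits at most `cap` members, nearest head first."""
    members = {h: [] for h in heads}
    for n in sorted(set(nodes) - set(heads)):
        # try heads in order of distance; a full head refuses, evening out the load
        for h in sorted(heads, key=lambda h: abs(pos[n] - pos[h])):
            if len(members[h]) < cap:
                members[h].append(n)
                break
    return members
```

Because admission stops at `cap`, no cluster can absorb a disproportionate share of members, which is the even-load property the abstract claims; rotating the election each round would additionally spread the head burden over time.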
We perform an extensive range of simulations using five quality measures, namely: the lifetime of the network, the stable and unstable regions of that lifetime, network throughput, the number of cluster heads, and the transmission time of the network, to analyze the proposed model.

In this article, we propose two classes of semiparametric mixture regression models with single-index for model-based clustering. A common problem in many areas of applied statistics is to identify groups of similar time series in a panel of time series.
However, distance-based clustering methods cannot easily be extended to time-series data, where an appropriate distance measure is difficult to define, particularly for discrete-valued time series. This model-based clustering method is instead based on finite mixtures of first-order, time-homogeneous Markov chain models. For a fixed number of clusters, the parameters are estimated within a Bayesian framework using a Markov chain Monte Carlo (MCMC) sampling scheme, a full Gibbs-type sampler involving only draws from standard distributions.
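The core of such a mixture is that each cluster owns a transition matrix, and a sequence is scored by its likelihood under each. A minimal sketch of the classification step (illustrative only; the actual method samples assignments and transition matrices jointly via Gibbs sampling rather than hard-assigning):

```python
import math

def seq_loglik(seq, trans):
    """Log-likelihood of a discrete sequence under a first-order Markov chain.

    trans[a][b] is the probability of moving from state a to state b.
    """
    return sum(math.log(trans[a][b]) for a, b in zip(seq, seq[1:]))

def classify(seq, chains):
    """Assign the sequence to the cluster whose chain gives it the highest likelihood."""
    return max(chains, key=lambda k: seq_loglik(seq, chains[k]))
```

A persistent sequence (long runs in one state) scores highest under a chain with heavy diagonal transition mass, while an alternating sequence favors a chain with heavy off-diagonal mass; for wage-mobility panels the states would be wage categories.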
Finally, an application to a panel of Austrian wage-mobility data is presented, leading to an interesting segmentation of the Austrian labour market.

A mixture model-based approach to the clustering of microarray expression data.
This paper introduces the software EMMIX-GENE that has been developed for the specific purpose of a model-based approach to the clustering of microarray expression data, in particular, of tissue samples on a very large number of genes.
The latter is a nonstandard problem in parametric cluster analysis because the dimension of the feature space (the number of genes) is typically much greater than the number of tissues. A feasible approach first selects a subset of the genes relevant to the clustering of the tissue samples by fitting mixtures of t distributions, ranking the genes in order of increasing size of the likelihood ratio statistic for the test of one versus two components in the mixture model.
The imposition of a threshold on the likelihood ratio statistic used in conjunction with a threshold on the size of a cluster allows the selection of a relevant set of genes.
However, even this reduced set of genes will usually be too large for a normal mixture model to be fitted directly to the tissues, and so mixtures of factor analyzers are used to effectively reduce the dimension of the feature space of genes.
For both data sets, relevant subsets of genes can be selected that reveal interesting clusterings of the tissues, consistent either with the external classification of the tissues or with background biological knowledge of these sets.

The energy consumption forecast is important for decision-making on national economic and energy policies.
But energy consumption is a complex and uncertain system problem, affected by the outer environment and various uncertainty factors. Herein, a novel clustering model based on set pair analysis (SPA) is introduced to analyze and predict energy consumption.
Combined with indicator weights, the group centroids of the DRIs for the influence factors were transformed into aggregate connection numbers in order to interpret uncertainty by identity-discrepancy-contrary (IDC) analysis. Moreover, a forecasting model based on similarity to the group centroid is discussed, to forecast the energy consumption of a given year from measured values of the influence factors.
The results indicate that the method presented here is more feasible and easier to use, and can interpret the certainty and uncertainty of the growth of energy consumption and its influence factors as a whole.
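The connection number at the heart of set pair analysis is mu = a + b*i + c*j, where a, b and c are the identity, discrepancy and contrary degrees (summing to 1), j is conventionally fixed at -1, and i varies in [-1, 1] to express how the discrepancy resolves. A minimal evaluation helper (the function name is illustrative):

```python
def connection_number(a, b, c, i=0.0, j=-1.0):
    """Set pair analysis connection degree mu = a + b*i + c*j, with a + b + c == 1."""
    assert abs(a + b + c - 1.0) < 1e-9, "IDC degrees must sum to 1"
    return a + b * i + c * j
```

For a=0.6, b=0.3, c=0.1, the connection degree ranges from 0.2 (discrepancy resolved against, i=-1) to 0.8 (resolved in favor, i=1), with 0.5 at the neutral i=0; comparing these values across group centroids is what drives the similarity-based forecast.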
Mixture model-based clustering and logistic regression for automatic detection of microaneurysms in retinal images.
The Master's degree in Procedural Law and Citizenship offers a series of subjects to its students, some mandatory and others elective. All students must complete, within a maximum of 24 months, the corresponding course loads, including the supervision period and the preparation of the dissertation for its defense. Coimbra: Almedina. A instrumentalidade do processo. Publicismo e privatismo no processo civil. Revista de processo, vol. Processo civil moderno.
Matter and method: from power to law and back again. The lesson learned from the classics. The Copernican revolution and the positivist method. Modern democracy. The Bobbian ideal: social-democratic liberalism. Most of the time, these are formulas that convey complex ideas simply, succinctly and concisely. Above all, because the ability to untangle the knots of complexity does not presuppose simplification or falsification.