0 votes
0 answers
63 views

I have a set of binarized images containing forms; each image follows one of N layouts. There are a few outliers which do not follow a layout and contain random text and images. The distance between ...
sebastian's user avatar
  • 1,818
0 votes
0 answers
73 views

I am using FlowSOM() clustering from the FlowSOM package and am getting an error while a vectorized function is running: Error in map2(): ℹ In index: 8. ℹ With name: FileID8. Caused by error in map() at ...
Mikey's user avatar
  • 9
0 votes
0 answers
34 views

I need some help with a fairly complex task I’ve been assigned: document reconciliation between different types of records. In short, I have to match documents with different “causal codes”: 2: Goods ...
H3doX's user avatar
  • 1
0 votes
0 answers
52 views

I've been using an ordered stereotype (OSM) approach to ordinal clustering with the R library 'clustord'. clustord is very well documented, with a step-by-step vignette. Therefore, to execute row ...
EB3112's user avatar
  • 339
0 votes
0 answers
47 views

I have been working with mclust, and have encountered issues that I can't find an obvious reason for. My main concern is that the threshold for multiple components to be found seems really high, and I ...
atelopus's user avatar
0 votes
0 answers
31 views

GridDB Container Partition Recovery After Node Failure: I'm working with a 3-node GridDB cluster and need to implement automatic recovery logic when one node fails. My application creates ...
Muhammad Saleem's user avatar
0 votes
0 answers
57 views

I have had cluster plots produced for some RNA Seq time course data using the LRT analysis. I believe the plots are produced using the command: clusters <- degPatterns(cluster_rlog, metadata = meta,...
Rob Staruch's user avatar
5 votes
3 answers
246 views

I'm trying to group rows that have values within specific error/tolerance. Input looks like this: input <- data.frame(Row_number = 1:22, Name = c(rep("A",6), rep("...
Jennifer's user avatar
  • 317
0 votes
0 answers
42 views

I am working with matched case-control data that used risk-set sampling with replacement (a control can be matched to more than one case). I am trying to figure out the correct syntax for conditional ...
user28632583's user avatar
2 votes
1 answer
159 views

Context: I have a 2D array (size N x M), let's call it U, where each cell contains a non-negative value K ≥ 0 representing a "density" at that point. I want to algorithmically divide the ...
JC Denton's user avatar
0 votes
1 answer
46 views

I'm hoping to get some advice on approaching a clustering problem. I have two separate spatial datasets, being real data and modelled data. The real data contains a binary output (0,1), which is ...
jonboy's user avatar
  • 392
1 vote
1 answer
126 views

I'm using Google RO API to create clusters. There is a capacity constraint on the clusters and the clusters should not overlap with each other. To do this, I've set the load demand of each shipment to ...
Darsh Patel's user avatar
0 votes
1 answer
177 views

I need help with Dragonfly DB, particularly benchmarking. Here is the story: I tried benchmarking Dragonfly as a cache to replace Redis. I got the expected result when testing a single node; it ...
amzshow's user avatar
  • 58
3 votes
5 answers
154 views

I need to combine interconnected list elements to form distinct elements in base R with no additional packages required (while removing NA and zero-length elements). Edit: I am looking for flexibility of ...
Peter's user avatar
  • 2,473
1 vote
1 answer
163 views

Fixed-size clusters: I need help with a capacitated clustering task. I have 400 locations (the number can vary each time), and I need to create fixed-size clusters (e.g., 40 locations per cluster). ...
Darsh Patel's user avatar
0 votes
0 answers
31 views

I'm trying to cluster values from a map in Python (these values could be income, kindness towards dogs, or the number of penguins in supermarkets; in my case the values are floats) from different data sources. ...
Auke Van Der Woude's user avatar
0 votes
0 answers
58 views

I performed HDBSCAN clustering: hdbscan_clusterer = hdbscan.HDBSCAN(min_cluster_size=200); df['Cluster'] = hdbscan_clusterer.fit_predict(data_matrix_for_clustering). Now, I’m interested in getting the ...
name0's user avatar
  • 1
0 votes
0 answers
48 views

Initially, I performed k-means clustering and obtained some meaningful clusters. To refine these clusters, I ran fuzzy c-means on the k-means centers using the "e1071" package. Are there any ...
Mary's user avatar
  • 221
2 votes
1 answer
186 views

Little intro: I have data (link at the bottom) with the score on the y-axis and the position on the x-axis, for different labels. Now I want to know if there is one label that is "significantly" ...
CodeNoob's user avatar
  • 1,840
1 vote
2 answers
95 views

I have a list of paired values. Values in each pair are declared as similar, meaning two values are considered similar if they appear together in a pair from the list. My goal is to create a list of ...
Ömer Faruk Güllüoğlu's user avatar
0 votes
0 answers
39 views

I'm using GeoJSONSource to show images on the map (like images on the map in Apple Photos). Those images are loaded from the FeatureCollection object, and the first thing I do is add them to the map style. ...
Alzemic's user avatar
  • 11
0 votes
0 answers
84 views

I am trying to fit an SEM in lavaan that includes both a measurement and a structural model. The measurement model consists of six latent variables, which serve as outcomes in the structural model. The ...
Quy Pham's user avatar
2 votes
0 answers
78 views

I am implementing fuzzy c-means for image segmentation following the given algorithm: However, when updating the centroids (this is the first thing that I do), all cluster centers converge ...
Albert4224's user avatar
1 vote
0 answers
27 views

I'm trying to run a purely spatial analysis using SaTScan in R, but I'm getting date-related errors even though I'm not using any temporal data. Here's the error: Error: Invalid date '775' in ...
Sarahk's user avatar
  • 11
1 vote
1 answer
75 views

I have 54399 cases and 2 channels (HOM and HOS), and I want to use multichannel sequence analysis. An example of the data is as follows: ID HOM1 HOM2 HOM3 HOM4 HOS1 HOS2 HOS3 HOS4 1 A A B C NO YES NO NO 2 ...
Fanny0000's user avatar
1 vote
0 answers
49 views

I am using a priority queue to do hierarchical clustering (I cannot import heapq) and want to use the complete-link method, but I don't know what the problem with my code is; the reason is far from ...
吳思覦's user avatar
-1 votes
1 answer
184 views

from sklearn.cluster import KMeans cs = [] for i in range(1, 11): kmeans = KMeans(n_clusters = i, init = 'k-means++', max_iter = 300, n_init = 10, random_state = 0) kmeans.fit(X) cs.append(...
Niubie's user avatar
  • 1
0 votes
1 answer
104 views

I have a directed graph where there are importance or weight attributes for both the nodes and edges. I am looking for a community or module detection implementation in python that will consider both ...
abbas786's user avatar
  • 401
1 vote
1 answer
211 views

I'm using k-means for my project for the first time. My dataset has more than 400,000 rows and 11 columns. I ran k-means for k = 3, 5, 7, 9, and 10. It took more than 65 minutes and still produced no output....
Joud's user avatar
  • 7
4 votes
3 answers
157 views

In R, I have the following dataframe with the column "overlap" listing rows that have overlapping values on some other column. df <- data.frame(overlap = c("1,2,3", "1,2,3&...
bcrew's user avatar
  • 105
0 votes
0 answers
181 views

I have a large dataset (2 million rows, 100 columns), and I need to perform clustering. I used the elbow method to determine the optimal number of clusters. However, in order to get a more refined ...
AbliusKarfax's user avatar
0 votes
1 answer
87 views

I'm trying to compute the DBCV metric (provided by "git+https://github.com/FelSiq/DBCV") on density-based clusters from a dataset similar to the one shown here: The calculation is performed ...
giuseppe sabino's user avatar
1 vote
0 answers
55 views

I have a series of text utterances in summary form (in the form of sentences). I am trying to cluster them, grouping them by similarity in context (not in literal meaning), and report the clusters ...
eashwar natarajan's user avatar
0 votes
1 answer
60 views

I'm trying to create a model in CPLEX OPL Studio for clustering with an additive criterion, but I have a number of errors that I don't know how to fix correctly, because I'm very bad at OPL Studio ...
Zraf Ker's user avatar
-2 votes
1 answer
40 views

I have a grid with many interconnections. The grid consists of edges of different length. I would like to cluster this grid into segments of similar length. The edges which are summarized in a cluster ...
Matthias's user avatar
1 vote
0 answers
45 views

Multiple self-propelled rods (modelled using an odd number of connected hard spheres) with a fixed self-propelled velocity are moving in a 2D medium with three different diffusion constants for ...
Anonymous One's user avatar
0 votes
1 answer
138 views

I'm working on a clustering problem and I would like to use the hclust functions to create the dendrogram and cutreeDynamic to create clusters from the mentioned dendrogram. In fact, I have already ...
José Adrián Pardo Pérez's user avatar
0 votes
1 answer
177 views

I have computed the dissimilarity matrix using the vegdist() function, with the method specified as "morisita". However, even though the hclust() function is built to read either distance or dissimilarity ...
Sukhraj Kaur 1910115's user avatar
3 votes
1 answer
59 views

I want to detect the three rectangles (white, gray, black) in this picture, as in the image below. I tried to use the find_contour function in OpenCV for Python, but the light gray stripes disturbed find ...
lksj's user avatar
  • 129
1 vote
2 answers
152 views

I have longitudinal data as follows: import pandas as pd # Define the updated data with samples only in 'sample_A' or 'sample_B' data = { 'gene_id': ['gene_1', 'gene_1', 'gene_1', 'gene_1', '...
donkey's user avatar
  • 1,458
1 vote
1 answer
107 views

I want each cluster to have a maximum of 20 items. Here is my code in PostgreSQL with the PostGIS extension: WITH RECURSIVE clustered_data AS (-- Step 1: Perform initial clustering SELECT pma.* ...
Ray92's user avatar
  • 463
4 votes
1 answer
510 views

I've been working on a topic modelling project using BERTopic 0.16.3, and the preliminary results were promising. However, as the project progressed and the requirements became apparent, I ran into a ...
Bbrk24's user avatar
  • 1,053
0 votes
1 answer
545 views

I am trying the t-SNE method to explore a high-dimensional dataset and reduce its dimensionality, and I have ended up with the following plot. I have used the TSNE parameters n_components=2 and init='...
skan's user avatar
  • 7,790
0 votes
1 answer
56 views

I use Spring Boot 3.x and an external Tomcat 10. I set up session clustering on the external Tomcat. If I check on the JSP page, the session is shared, but if I check the same logic with Spring Boot ...
watercolor's user avatar
-1 votes
1 answer
134 views

I want to make a text classifier, which is then used to suggest the most similar text to a given one. The flow of the app is the following: extract the main 10 topics from the text, using a ...
will's user avatar
  • 161
0 votes
1 answer
153 views

I have this dataset: > dput(mdata2) structure(list(EE = c(3.3221428469822, 3.62699732299098, 1.75430154205983, 0.809228977410138, 1.24117055233438, 2.93403148663873, 4.01630566539058, 1....
Fabrizio's user avatar
  • 947
0 votes
1 answer
257 views

I have a set of sentences which I have transformed into vectors using SBERT embeddings. I would like to cluster these vectors. When looking for information online, I keep seeing posts saying to do ...
Alex Jax's user avatar
2 votes
1 answer
96 views

I want to do the same as asked here, using the first approach from the question. Sadly, the mods variable from the following line is not defined, and I'm asking myself how to adjust: g2 <- delete....
Sulz's user avatar
  • 523
0 votes
0 answers
47 views

I have defined a specific function in my project re-implementing the k-means algorithm, but at the point where the centroids are meant to be reassigned and obtain new values, they come out as NONE. ...
KIZ-MAN's user avatar
  • 33
0 votes
1 answer
52 views

I am using h2o.ai and a sample credit card dataset to run kmeans clustering. Which model should I use to run K means clustering in h2o.ai? I chose Unsupervised learning. There are 2 options with ...
user26844683's user avatar
