Skip to main content
Filter by
Sorted by
Tagged with
1 vote
1 answer
56 views

Pardon my verbosity but I think that I've much to explain. My English leaves something to desire. I'm 66 years old, from Italy, with some experience in programming, but I never ventured in the realm ...
Cesare Brizio's user avatar
2 votes
0 answers
96 views

I am grouping my data using the ward.D2 hierarchical clustering method in R. I need to calculate the distance and similarity for each step, from 2 to 20 clusters. Similarity is calculated using the ...
pnlp's user avatar
  • 35
1 vote
1 answer
157 views

I'm trying to understand the difference between the vector.similarity.cosine Cypher function and the gds.similarity.cosine function in Neo4j. According to the Neo4j documentation, both are used to ...
Gal Shubeli's user avatar
1 vote
1 answer
207 views

I have been developing a matching system which matches the rows of the client and our central database depending on similarity. I have used a hybrid approach where I needed to somehow map the Company, ...
Prabhjit Singh's user avatar
1 vote
0 answers
174 views

I’ve been working with Qwen2.5-VL and Gemma3 locally, and I need to measure the similarity between text and image embeddings—similar to CLIP/SigLIP—but I’m resource-limited and can’t spin up ...
H.H's user avatar
  • 11
1 vote
3 answers
94 views

How can I perform a GROUP BY in SQL when the group_name values are similar but not exactly the same? In my dataset, the group_name values may differ slightly (e.g., "Apple Inc.", "...
Ahamad's user avatar
  • 1
0 votes
0 answers
91 views

This question builds on a helpful solution provided for calculating uniqueness across a SpatRaster -- Using terra in R to calculate & map similarity and uniqueness across cells of a large GDM-...
Sean Basquill's user avatar
0 votes
1 answer
197 views

I am looking to employ terra to update an analytical workflow (from Mokany et al 2022; Glob Ecol and Biogeo) originally written in R with the raster package. The workflow involves spatial analyses of ...
Sean Basquill's user avatar
0 votes
0 answers
41 views

I have a ticketing system where people create ticket for their issue. When someone is trying to create a new ticket I have to search my elastic search to identify whether a similar ticket is already ...
Ramji's user avatar
  • 75
0 votes
1 answer
74 views

I'm interested in finding similarities between two lists. I have the count of duplicates in the first column, and the pattern is in the second column. What would be the most logical way to compare ...
rollTHERoad's user avatar
0 votes
0 answers
99 views

I am appending each row in csv file into chromadb with format, such as below; #Acme1 prod #Acme1 line, AC1 #Acme2 prod #Acme2 line, AC2 embedding = HuggingFaceEmbeddings(model_name="all-MiniLM-L6-...
Orkun Gedik's user avatar
0 votes
1 answer
74 views

I have dataframe with 1000 text rows. df['text'] I also have 5 words that I want to know for each one of them how much they represnt the text (between 0 to 1) every score will be in df["word1&...
rafine's user avatar
  • 469
1 vote
1 answer
64 views

I have two data frames: ref_df containing for each row information about species and the latitude and longitude they was recorded and sample_df with sample names as rows and species names as columns,...
Elie Tièche's user avatar
0 votes
1 answer
84 views

I have dataframe with 1000 text rows. I did word2vec . Now I want to create a new field which give me the distance from each sentence to the word that i want, lets say the word "king". I ...
rafine's user avatar
  • 469
3 votes
1 answer
66 views

I want to utilize BERT to assess the similarity between two pieces of text: from transformers import AutoTokenizer, AutoModel import torch import torch.nn.functional as F import numpy as np tokenizer ...
Beitian Ma's user avatar
1 vote
0 answers
32 views

I have a task for image similarity comparisons between two binary images of tactile maps (maps for the visually impaired to map out an area). The goal is to have an output score of how similar the two ...
Michael Liang's user avatar
0 votes
0 answers
28 views

how to compare two blob colomns on the same table ? For example: PIC1 contains a picture of a bridge taken from 5 meter distance, whereas PIC2 contains a picture of the same bridge taken from 10 meter ...
padjee's user avatar
  • 265
1 vote
1 answer
49 views

I want to retrieve all rows from a PostgreSQL database that contain sentences similar to a provided string. The sentences in the database can have their words in any order (rearranged). How can I ...
Shivam Baldha's user avatar
1 vote
0 answers
46 views

I am comparing pairs of identical directed graphs represented in GXL format using the Graph-Matching-Toolkit(https://github.com/dzambon/graph-matching-toolkit). Since the graphs are identical, I ...
NoName Su's user avatar
0 votes
1 answer
42 views

I try to execute the attached code in the following link: https://github.com/biometricsecurity/Preimage-attack-on-BTP-template The code aims to measure the similarity attacks at CB systems. I have ...
GeGe's user avatar
  • 1
1 vote
1 answer
44 views

The agents (called farms) in my model have peers (an agentset of farms they have links to). Each farm has an attribute called rotation, which is a list of values between 0 and 1 (same length for all ...
Bartosz Bartkowski's user avatar
0 votes
1 answer
131 views

Is there any existing optimized Python implementation of Angular Metric for Shape Similarity (AMSS)? Otherwise, could I approximate it by considering the derivative DTW and using cosine similarity ...
hfaila's user avatar
  • 1
1 vote
0 answers
53 views

I am trying to build a small image search program and using Milvus as a database to store my embeddings, on trying to retrieve the result by matching the vector embeddings with the embeddings obtained ...
lazy panda's user avatar
0 votes
1 answer
505 views

I have a single Python function that I am using the embed JSON objects are different lengths. The issue I am having is that, somehow, the dimensions are different when comparing the vectors and I ...
Ken Tola's user avatar
1 vote
0 answers
61 views

I have a boolean array which represents store-availability of retail products across 3000 different stores. so, my schema looks like below: product_id = FieldSchema( name="product_id", ...
Mohamed Niyaz's user avatar
0 votes
0 answers
241 views

I have objects that have many attributes, example: item = { 'id': 123, 'name': 'Keyboard', 'price': 12, 'url': 'example.com', 'description': 'a keyboard that etc...', 'details': { 'color': '...
AbdulmohsenA's user avatar
1 vote
0 answers
87 views

Following on the example here, one way to create a query of the collection from ChromaDB with filtering by a given type of metadata (i.e. "source_type") is results = collection.query( ...
user18959's user avatar
0 votes
1 answer
203 views

I have a dataframe as given below: The table only has values from the upper triangle of a matrix. I want to plot a correlation plot (correlogram) where the colours show the correlation and size ...
Kamalika Ray's user avatar
0 votes
0 answers
119 views

Problem description: I am trying to perform an efficient multi-attribute similarity search across millions of records in a database. However, my process requires an hierarchical order of criteria for ...
AK2001's user avatar
  • 1
1 vote
0 answers
79 views

So I am trying to measure data from a smartphone ambient light sensor (ALS). My goal is to be able to be able to look at the data and be able to infer the location of the device. To do this my plan is ...
nosilak0's user avatar
1 vote
2 answers
150 views

import matplotlib.pyplot as plt import numpy as np from numpy.linalg import norm def cosine_similarity(arr1:np.ndarray, arr2:np.ndarray)->float: dot_product = np.dot(arr1, arr2) magnitude =...
Prashant's user avatar
  • 947
1 vote
2 answers
136 views

Here are some examples of the types of sentences that need to be considered "similar" there was a most extraordinary noise going on shrinking rapidly she soon made out there was a most ...
BLOCKCRAFT 2.0's user avatar
1 vote
0 answers
650 views

So basically I am trying to search a cell line vector data base that has entries that look like this using langchain: ID: 253F1 AC: CVCL_B513 SY: NA OX: NCBI_TaxID=9606; ! Homo sapiens (Human) CA: ...
Nicholas Piccaro's user avatar
-2 votes
1 answer
46 views

Each point has an x, y, and size. For example these should result in similar: Plot 1-A: Plot 1-B: And these should not result in similar: Plot 2-A: Plot 3-A: Are there any algorithms or ways to ...
vnnsnnd's user avatar
0 votes
2 answers
191 views

I want to compare two semantic Knowledge Graphs, to see if they have any triple in common, using cypher. MATCH (n1)-[r1]-(c1) MATCH (n2)-[r2]-(c2) WHERE r1.filePath = "../data/graph1.json" ...
biowhat's user avatar
  • 17
0 votes
1 answer
75 views

I have a dataframe structured the following way with much more rows and columns: Report_ID Block_ID Number Character 1 1 5 A 2 1 3 A 3 1 2 B 4 2 10 A 5 2 11 B 6 2 100 C 7 3 2 D 8 3 #NA A 9 3 8 D 10 3 ...
Marius's user avatar
  • 3
1 vote
2 answers
99 views

I have a list of non conformities appeared in different time with different products. I need to find out similar problems. I already made sorting Now I need to get new sheet with similar rows with ...
Andrei Samoilov's user avatar
2 votes
1 answer
358 views

I want to rate the similarity between two tags. For example the words technology, computer and chip should have high similarity, a word like food should be low similarity. Given the recent ...
Sir hennihau's user avatar
  • 1,874
2 votes
0 answers
41 views

I have a table with three columns: Source, Target, Similarity. The first two are strings, the last one is a float. This table came about by comparing source elements and target elements and finding ...
Mitsarien's user avatar
0 votes
1 answer
2k views

I have a reference image A and 2 target images B and C , I tried to measure the SSIM as follows : (from a human vision perception A & B are from the same class) and A & C from different ...
Ziri's user avatar
  • 746
1 vote
0 answers
132 views

How to run a similarity search in a database and the output should be a table with molecules which passed a specific treshold? I tried this query = sql.SQL(""" SELECT *, ...
Vincent_chem's user avatar
0 votes
0 answers
75 views

I need to query the top few texts that are most similar based on the input content. The table structure is as follows: create table documents ( id bigserial primary ...
accbear's user avatar
  • 23
0 votes
1 answer
79 views

I have this python code trying to find similarities between brands based on the types of products they sell and at what price point. One issue I'm running into is price, sales, and number of products ...
krizik's user avatar
  • 1
0 votes
2 answers
107 views

i have a dataframe as such. id action enc Cell 1 run,swim,walk 1,2,3 Cell 2 swim,climb,surf,gym 2,4,5,6 Cell 3 jog,run] 7,1 This table goes on for roughly 30k rows. After gathering all these actions, ...
Jacob's user avatar
  • 3
1 vote
0 answers
49 views

I have two vectors, for example: a = 25,26,37,36,27,33,104,44,40,49,45,48,50,55,56,59,54,57,105,64,73,76,72,67,68,71,78,82,77,79,86,84,83,85,91,92,96,97,102,101,93,98,99,100,94,95,88,87,65,66,90,89,80,...
purecobalt's user avatar
-3 votes
2 answers
3k views

When using thefuzz in Python to calculate a simple ratio between two strings, a result of 0 means they are totally different while a result of 100 represents a 100% match. What do intermediate results ...
David Shaw's user avatar
1 vote
0 answers
45 views

Let us consider a set of data with, say, 10 values. We need to estimate its characterstics by Monte Carlo method, that is with a large number of randomly generated permutated sets. If we'd consider ...
Stan's user avatar
  • 8,778
1 vote
1 answer
101 views

From the igraph documentation: "The Jaccard similarity coefficient of two vertices is the number of common neighbors divided by the number of vertices that are neighbors of at least one of the ...
Zachary's user avatar
  • 381
0 votes
2 answers
1k views

My programme is chatting with PDF files in a directory. Surprisingly the code works if there 5 PDF files in directory of 1 page each. But it doesn't work when there are 1000 files of 1 page each. It ...
Rajeshwar Singh Jenwar's user avatar
2 votes
2 answers
105 views

I have a data frame called text with two columns, year and text. Find the dput output below for an example: text <- structure(list(year = 2000:2007, text = c("I went to McDonald's and they ...
fsure's user avatar
  • 335

1
2 3 4 5
38