2,467 questions
2
votes
0
answers
107
views
Estrin polynomial evaluation - surprising interaction with loop unrolling
This simple Estrin evaluation of a polynomial shows a curious loop unrolling trick by MSVC that works surprisingly well. My problem is that I cannot understand why it works at all. There seem to be no ...
Advice
0
votes
2
replies
75
views
Influence of Round-Off Errors on the von Neumann Stability Analysis
I am studying the explicit finite difference scheme for the 1D heat equation
and I want to analyze the effect of round-off errors when computations are performed in Float16.
I consider the update:
u_i^...
8
votes
2
answers
247
views
Compute sin(x)-x efficiently to double precision accuracy on range |x| <= pi
There is a trig related function missing from math.h, namely x-sin(x). I am trying to implement it accurate to full double precision for |x| <= pi. Minimum ripple polynomial or rational ...
3
votes
3
answers
134
views
Numerically finding the minimum value of a variable where a function equals a value
I have a black box function baseball that determines the magnitude of the distance that a baseball travels when launched. It takes the initial speed, the angle of launch and the initial spin. Assuming ...
4
votes
1
answer
301
views
Why does Newton’s method overshoot on the first deceleration step in my motion profile generator?
I’m porting a Python motion profile generator to C to implement for my STM32H743. The generator produces step timings for a simple acceleration → cruise → deceleration motion profile. See the ...
2
votes
2
answers
133
views
Estrin polynomial evaluation - Intel ICX strange slowdown for lengths 16N-1
Porting what I believe to be working code for a radix-2 Estrin polynomial evaluation scheme to the Intel ICX 2024 compiler I am seeing a slowdown for a handful of polynomial lengths just below 16*N (...
1
vote
1
answer
125
views
integer exact computation with logs
I need to compute ceil(log_N(i)) where log_N is the log with positive integer base N and i is also a positive integer.
The straight forward python implementation using floating point math fails at ...
1
vote
0
answers
66
views
Is it possible to reduce the memory requirements for (I)FFT by chunking?
I'm trying to run an IFFT on a very long signal (10^9 data points) and I'm running into RAM constraints. I'd like to be able to chop up the signal and run it in batches (as I am not time constrained ...
6
votes
1
answer
149
views
What is wrong with my block GMRES implementation?
I'm trying to implement a block GMRES procedure (i.e., GMRES to solve Ax=b, but with b that is not a vector but a n x r matrix, where r << n). My goal is to have a first implementation in Python ...
-1
votes
2
answers
196
views
Is there a faster method to compute these distance-based sums over a nonuniform grid of cells?
I'm attempting to implement an algorithm to compute an array of values. Unfortunately, I've found the literal implementation of that algorithm to be too slow for my use case. I suspect there's an ...
4
votes
0
answers
127
views
MSVC fixed short length polynomial evaluation curiosity - why does it not keep my coefficients array contiguous, and why subtract the absolute value?
MSVC seems to be taking the values from my array of coefficients and scattering them around in its .rdata section, not keeping them contiguous even though they're all used together. And it takes the ...
2
votes
2
answers
128
views
Differentiating OdeSolution object
I would like to compute the residual of the numerical solution to an ordinary differential equation.
Here is the code
import numpy as np
from scipy.integrate import solve_ivp
def f(t, x):
return ...
0
votes
0
answers
69
views
MICCG(0) for a fluid simulation fails at Neumann boundaries
I am trying to follow Robert Bridson's Fluid Simulation Notes (https://www.cs.ubc.ca/~rbridson/fluidsimulation/fluids_notes.pdf) to implement my own eulerian fluid simulator for the first time.
I was ...
-3
votes
3
answers
301
views
Taylor Series in Java using float [closed]
I am implementing a method berechneCosinus(float x, int ordnung) in Java that should approximate the cosine of x using the Taylor series up to the specified order.
My questions:
Can I optimize the ...
0
votes
0
answers
52
views
Problem with multigrid method implementation in MATLAB
I'm implementing a 2-grid method for solving discrete Poisson equation. I have read many papers to understand multigrid methods, for this implementation I'm following an example from this paper (p. 10)...
1
vote
0
answers
81
views
LAPACK Inconsistent across multiple different operating systems and devices
Description
I have a deterministic program that uses jax, and is heavy on linear algebra operations.
I ran this code on CPU, using three different CPUs. Two MacOs Systems (one on Sequoia (M1 Pro), ...
0
votes
1
answer
114
views
How to python code a distributed delayed differential system in python?
Already tried with ddeint, does not yield good results. I have an equivalent PDE system that works just fine and I can compare results. The key of the problem is that the system is coupled, so I ...
1
vote
1
answer
95
views
Reducing to Hessenberg form using Givens rotations
I'm trying to implement a function in Python that reduces a square matrix to upper Hessenberg form using Givens rotations. I know that this should be possible Wiki and sometimes preferred over ...
4
votes
0
answers
107
views
Issues with computing determinant of high dimensional symbolic matrix
I am trying to solve the equation,
|λ²·A + λ·B + C| = 0,
where lambda is a (possibly complex) scalar and A, B, and C are N x N matrices. The goal is to compute the roots of the characteristic ...
0
votes
1
answer
195
views
One adaptive-time-step solution for ODE
I want to solve some ODE just for one time step. The time step is decided internally by the solver, not by me. Is there a python adaptive-time-step ode solver which could do this? For illustration the ...
0
votes
1
answer
63
views
Numerically stable noncentral chi-squared distribution in torch?
I need numerically stable non-central chi2 distribution in torch.
2
votes
1
answer
76
views
Volume preservation in phase space in leapfrog algorithm
Based on the information provided in these lecture notes, the leapfrog integration algorithm should preserve area (for single-body systems) in phase space.
I implemented the algorithm for a Coulomb ...
4
votes
3
answers
311
views
How can I further optimize this C++ function that takes integer square roots?
In an algorithm I am using, most of the time is spent in the evaluation of square roots using the sqrt function in cmath. In my application I only need the integer part of the square root of integers, ...
7
votes
2
answers
863
views
Did I beat Remez in sine approximation?
First, it is the most compact sine approximation ever. It seems I can do better than Remez in terms of precision/performance. Here the [0,pi/2] approximation range.
double p[] =
-0....
2
votes
1
answer
78
views
Implementation of Implicit Runge–Kutta
I’ve written a MATLAB implementation of an implicit Runge–Kutta method (Radau IIA). It behaves well on most test problems, but it refuses to converge on the following very stiff equation like
y' = y^2 ...
0
votes
0
answers
80
views
Compute Double integral of a function f(x,y) giving its values on a grid with python
For a grid ((xi,yj))i,j, and a function f with well knows values at points (xi,yi). We want to compute the double integral of f according to a measure phi(y)dxdy, where phi is a normalized function.
...
0
votes
1
answer
64
views
Cutover point between two CDF implementations
My goal is to implement a CDF (cumulative distribution function) for the Standard Normal Distribution and to do so with the best possible numeric accuracy.
This is my starting point: (1 + erf(x / sqrt(...
4
votes
5
answers
340
views
Why no floating point error occurs in print(0.1* 100000) vs (Decimal(0.1)*100000) due to FP representation of 0.1?
I am studying numerical analysis and I have come across this dilemma.
Running the following script,
from decimal import Decimal
a = 0.1 ;
N = 100000 ;
# product calculation
P = N*a
# Print product ...
2
votes
2
answers
121
views
Find correct root of parametrized function given solution for one set of parameters
Let's say I have a function foo(x, a, b) and I want to find a specific one of its (potentially multiple) roots x0, i.e. a value x0 such that foo(x0, a, b) == 0. I know that for (a, b) == (0, 0) the ...
0
votes
0
answers
69
views
Phase Portrait of Coupled ODEs Not Matching Expected Graph in Python (SciPy)
I’m trying to generate a phase portrait for a system of coupled ordinary differential equations (ODEs) using Python’s scipy.integrate.solve_ivp. The system models the frequency of cooperators (x) and ...
0
votes
1
answer
89
views
Difference in output of sympy.binomial vs scipy.special.comb
Why does the following code that calls comb produce the wrong output?
from scipy.special import comb
from sympy import binomial
def coup(n=100, m=10):
expectation = 0
for j in range(1, n + 1):
...
1
vote
0
answers
75
views
multivariate Fox-H numerical computation
I have some troubles in numerically calculating PDF below:
SNR of particular system model
This PDF is the expression of SNR of practical RIS-aided system: https://ieeexplore.ieee.org/document/9387559 (...
0
votes
1
answer
189
views
Smooth numerical AND function with N > 2 parameters for a C++ optimization engine
Consider an optimization engine that uses target functions and constraints that requires smooth functions (with at least first-order continuous derivative) to work well.
Now consider a boolean ...
0
votes
0
answers
76
views
How to simulate in python the superconducting density of states of a BCS superconductor with YSR (yu-shiba-rusinov) with 1D chain tight binding model
I try to simulate the following problem:
Compute the density of states numerically of a BCS superconductor with magnetic impurities using the tight binding model of 1D chain with periodically border ...
0
votes
0
answers
207
views
Minimum image convention for a MD simulation in Python
I'm trying to implement a simple molecular dynamic simulation in Python for N atoms with periodic boundary conditions using minimum image convention.
Here is my code:
import numpy as np
import ...
0
votes
0
answers
60
views
Simple MD simulation for N atoms with PBC in Python
I'm trying to implement a simple molecular dynamic simulation in Python for N atoms with periodic boundary conditions while allowing coordinate wrapping to visualize the atoms interacting through
...
0
votes
1
answer
162
views
Molecular dynamic simulation using velocity verlet in Python
I'm trying to implement a simple MD simulation in Python (I'm new to this),I'm using LJ potential and force equations along with Verlet method.
I posted another question about the same code yesterday ...
1
vote
1
answer
89
views
MD simulation using velocity verlet in Python
I'm trying to implement a simple MD simulation in Python (I'm new to this),I'm using LJ potential and force equations along with Verlet method:
def LJ_VF(r):
#r = distance in Å
#Returns V in (...
1
vote
3
answers
202
views
With IEEE 754 floating point numbers, is underflow still better than overflow? [closed]
In his classic works from the 1970s, Richard Hamming advises setting up floating point calculations to be more likely to underflow than overflow.
For example, if you want to calculate 1 - 1/(exp(x)+1),...
0
votes
1
answer
42
views
Steps approximation for time series scatter with mean changing every K number of steps using BIC
First, the synthetic data used is generated the following the way:
import sympy as sp
import numpy as np
import matplotlib.pyplot as plt
import random
import math
np.random.seed(2)
n_samples = 180
...
2
votes
1
answer
97
views
How to solve the derivative of the expression with respect to D?
The integral is
Where the variables follow the following distribution:
Thus, the integral becomes:
Now, my code in python is to integrate the expression with \(W_{1:3}\) being \(1,2,3\) ...
0
votes
2
answers
66
views
Computing an iterated integral by iteratively applying the 1D trapezoidal rule
I have a Python function, called trap_1D, which computes the (approximate) integral of a univariate function f over an interval [a,b] using the composite trapezoidal rule. Now I want to evaluate an ...
2
votes
1
answer
117
views
How can I approximate 255/sqrt(x) using Newton's method?
I am trying to approximate 255 / sqrt(x) using Newton's method to avoid using division or the sqrt() operation on an architecture where those operations are expensive.
My derivation is:
y = 255 / sqrt(...
2
votes
1
answer
199
views
Problem with mismatched length when using a mask
I'm writing a code and I have a function that calculates the values that are not fulfilling a condition with the values that are fulfilling the condition, but I'm having a lot of trouble with managing ...
0
votes
1
answer
161
views
Numerical instability in forward-backward algorithm for Hidden Markov Models
I am implementing the forward algorithm for Hidden Markov Models (see below for the algorithm). To prevent over/underflow, I work with log-probabilities instead, and use the log-sum-exp trick to ...
3
votes
2
answers
239
views
Using SciPy minimize to solve the Catenary problem
I am attempting to use SciPy optimization to find the function that minimizes the potential energy of a hanging rope between 2 points, the catenary problem. The objective function is defined by
The ...
2
votes
2
answers
211
views
Algorithm for calculating fraction from decimal with limits
The algorithm for calculating a fraction from a decimal number is known and there are enough examples for that. In my case I need to find a fraction for a given decimal where A = N/D is optimized but ...
2
votes
0
answers
146
views
Adams-Bashforth coefficients
I implemented a first draft of the Adams-Bashforth multistep-solver in C++. To make this method work, you need coefficients. In Wikipedia these are the bj in this formula:
y_n+1 = y_n + h * sum_over_j(...
1
vote
2
answers
189
views
Problem with least squares rational approximation to `asin(x)+ sqrt(1-x^2)` in [3,1] form
I'm trying to generate a decent [3,1] rational least squares polynomial approximation to asin(x)+sqrt(1-x^2) on [0,1] and failing dismally :(
The problem is that it has a pole for this particular ...
1
vote
0
answers
103
views
Solving Nonlinear Boundary Value Problem with Constraints
I want to solve the following boundary value problem
where S, k_i, x_f, α_1, and θ are known parameters. We are trying to solve for h(x), p, and θ_d.
My idea was to use finite differences to create a ...