Can someone please explain why
B = diag(-2*ones(m,1),0) + diag(ones(m-1,1),-1) + diag(ones(m-1,1),1)
takes 1.22 s, while
A = diag(-2*ones(1,n)) + diag(ones(1,n-1),1) + diag(ones(1,n-1),-1)
takes about an hour, for the same n = m = 10000?
You didn't specify which version of Octave (and/or Matlab?) you are using. Older versions may not have applied internal optimizations to account for column-ordered operations being more memory- and speed-friendly. Recent versions of both programs show nearly identical run times for the commands you provided.
Testing on recent versions of Octave:
Octave 9.1.0:
----------------------------------------------------------------------
GNU Octave Version: 9.1.0 (hg id: d0c18b1446df)
GNU Octave License: GNU General Public License
Operating System: MINGW32_NT-6.2 Windows 6.2 x86_64
----------------------------------------------------------------------
>> n = m = 10000;
>> tic; B = diag(-2*ones(m,1),0) + diag(ones(m-1,1),-1) + diag(ones(m-1,1),1); toc
Elapsed time is 0.989229 seconds.
>> tic; A = diag(-2*ones(1,n)) + diag(ones(1,n-1),1) + diag(ones(1,n-1),-1); toc
Elapsed time is 0.981692 seconds.
Octave 10.1.0:
----------------------------------------------------------------------
GNU Octave Version: 10.1.0 (hg id: 417c47651ed5)
GNU Octave License: GNU General Public License
Operating System: MINGW32_NT-6.2 Windows 6.2 x86_64
----------------------------------------------------------------------
>> n = m = 10000;
>> tic; B = diag(-2*ones(m,1),0) + diag(ones(m-1,1),-1) + diag(ones(m-1,1),1); toc
Elapsed time is 1.24033 seconds.
>> tic; A = diag(-2*ones(1,n)) + diag(ones(1,n-1),1) + diag(ones(1,n-1),-1); toc
Elapsed time is 1.29325 seconds.
and a recent nightly build, soon to become 10.2.0:
----------------------------------------------------------------------
GNU Octave Version: 10.1.1 (hg id: 34e0522499ae)
GNU Octave License: GNU General Public License
Operating System: MINGW32_NT-6.2 Windows 6.2 x86_64
----------------------------------------------------------------------
>> n = m = 10000;
>> tic;B = diag(-2*ones(m,1),0) + diag(ones(m-1,1),-1) + diag(ones(m-1,1),1); toc
Elapsed time is 1.06365 seconds.
>> tic;A = diag(-2*ones(1,n)) + diag(ones(1,n-1),1) + diag(ones(1,n-1),-1); toc
Elapsed time is 1.06898 seconds.
Checking a recent version of Matlab:
Matlab R2024b:
-----------------------------------------------------------------------------------------------------
MATLAB Version: 24.2.0.2712019 (R2024b)
Operating System: Microsoft Windows 11 Enterprise Version 10.0 (Build 22631)
-----------------------------------------------------------------------------------------------------
>> n = 10000; m = n;
>> tic;B = diag(-2*ones(m,1),0) + diag(ones(m-1,1),-1) + diag(ones(m-1,1),1); toc
Elapsed time is 0.530463 seconds.
>> tic;A = diag(-2*ones(1,n)) + diag(ones(1,n-1),1) + diag(ones(1,n-1),-1); toc
Elapsed time is 0.527413 seconds.
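To repeat this comparison on your own installation, something like the following sketch should work in both Octave and Matlab. It is only an illustration: the variable names tA and tB are made up here, and n is reduced to 2000 (an assumption, not the value from the question) so that the dense matrices stay small. It prints the version string, times both forms, and checks that they build the same matrix.

```octave
% Illustrative reproduction script (not from the original post).
% tA and tB are hypothetical names; n is smaller than the 10000 used
% in the question, because each dense 10000x10000 double matrix
% occupies about 800 MB.
n = 2000; m = n;

fprintf('Version: %s\n', version());

% Column-vector form
tic;
B = diag(-2*ones(m,1),0) + diag(ones(m-1,1),-1) + diag(ones(m-1,1),1);
tB = toc;

% Row-vector form
tic;
A = diag(-2*ones(1,n)) + diag(ones(1,n-1),1) + diag(ones(1,n-1),-1);
tA = toc;

fprintf('column vectors: %.3f s\n', tB);
fprintf('row vectors:    %.3f s\n', tA);

% Both forms should produce the same tridiagonal matrix.
assert(isequal(A, B));
```

If the two times differ wildly on your machine, including the printed version string in the question would make the difference much easier to diagnose.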
My guess is that ones(m,1) gives nice, cache-friendly consecutive memory accesses, and that ones(1,n) has significant extra administrative overhead. I'm a bit surprised the difference in speed is quite so large.