Summing list to specific threshold with zeros trace

Question

Is there a reasonable way to get the following done in a fast compilation way?

I'm trying to sum a list of numbers up to a specific threshold and replace the preceding values with 0. I'm looking for the fastest compilation way (the list has 18 kk records).

For the given example, the threshold is "1".

Input:

[0.2, 0.4, 0.2, 0.2, 0.1, 1.2, 3.2 ,0.2, 0.1, 0.4, 0.5, 0.1]

Output:

[0.0, 0.0, 0.0, 1.0, 0.0, 1.3, 3.2 ,0.0, 0.0, 0.0, 1.2, 0.1]

What is "kk"?..

– Kelly Bundy, Commented Feb 19, 2023 at 18:18
And what are "reasonable" and "fast" for you?

– Kelly Bundy, Commented Feb 19, 2023 at 18:22
What do you mean with "compilation way"?

– Kelly Bundy, Commented Feb 19, 2023 at 18:38
Python is interpreted - i.e., it is not compiled

– jackal, Commented Feb 19, 2023 at 18:46
@KellyBundy 1. 18 million, 2. 0-2 h for example.

– na ni, Commented Feb 19, 2023 at 18:59

RomanPerekhrest · Accepted Answer · 2023-02-19 18:47:43Z

A faster approach than appending each interim value to the final list:

lst = [0.2, 0.4, 0.2, 0.2, 0.1, 1.2, 3.2 ,0.2, 0.1, 0.4, 0.5, 0.1]

res = [0] * len(lst)  # initial zeros
L_size, t = len(lst), 0
for i, n in enumerate(lst):
    t += n
    if t >= 1 or i == L_size - 1:
        res[i] = t
        t = 0
print(res)

[0, 0, 0, 1.0, 0, 1.3, 3.2, 0, 0, 0, 1.2000000000000002, 0.1]
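
The snippet above hard-codes the threshold as 1 and works on a module-level list. As a minimal sketch, the same preallocation idea can be wrapped in a function; the name sum_to_threshold and the threshold parameter are illustrative assumptions, not part of the original answer.

```python
def sum_to_threshold(values, threshold=1):
    """Accumulate values; emit the running total where it reaches the threshold.

    Hypothetical helper name. Positions that do not close a run stay 0, and the
    last position always receives whatever total is left over.
    """
    res = [0] * len(values)      # preallocate instead of appending
    last = len(values) - 1
    total = 0
    for i, n in enumerate(values):
        total += n
        if total >= threshold or i == last:
            res[i] = total
            total = 0
    return res


lst = [0.2, 0.4, 0.2, 0.2, 0.1, 1.2, 3.2, 0.2, 0.1, 0.4, 0.5, 0.1]
print(sum_to_threshold(lst))
```

An empty input simply returns an empty list, because the loop body never runs.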

answered Feb 19, 2023 at 18:47

4 Comments

Andrej Kesely Over a year ago

Maybe even faster would be to update the lst in-place: lst[i] = t and in else block lst[i] = 0

RomanPerekhrest Over a year ago

@AndrejKesely, that indeed could be an option if mutating the original list is acceptable (or desirable)

Kelly Bundy Over a year ago

@AndrejKesely Seems to make it slower, see my benchmark.

Andrej Kesely Over a year ago

@KellyBundy Yep, that's true. Thanks for the benchmark! (when in doubt, benchmark it :)

Kelly Bundy · Accepted Answer · 2023-02-19 19:57:08Z

A list comprehension:

s = 0.0
res = [
    0.0 if (s := s + x if s < 1.0 else x) < 1.0 else s
    for x in lst
]
res[-1] = s
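
The comprehension packs two steps into one expression, so it may help to see it unrolled. The following is a sketch of what appears to be an equivalent explicit loop (it is expected to be slower than the comprehension); the function name threshold_sum_loop and the threshold parameter are assumptions added for illustration.

```python
def threshold_sum_loop(values, threshold=1.0):
    """Explicit-loop reading of the walrus comprehension (hypothetical name)."""
    result = []
    running = 0.0
    for x in values:
        # If the previous running sum already reached the threshold, it was
        # emitted on the previous step, so a new run starts from x.
        running = running + x if running < threshold else x
        # Emit 0.0 while the run is still below the threshold.
        result.append(0.0 if running < threshold else running)
    if result:
        # The last slot always holds the final (possibly partial) sum.
        result[-1] = running
    return result


lst = [0.2, 0.4, 0.2, 0.2, 0.1, 1.2, 3.2, 0.2, 0.1, 0.4, 0.5, 0.1]
print([round(v, 6) for v in threshold_sum_loop(lst)])
```

Unlike the original snippet, this sketch guards the final assignment so that an empty input does not raise an IndexError.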

Benchmark with ~1.8 million values, times multiplied by 10 to estimate for your 18 million:

1.74 ± 0.06 seconds  Kelly
3.32 ± 0.10 seconds  Roman
3.56 ± 0.10 seconds  Roman_Andrej
5.17 ± 0.07 seconds  mozway

Benchmark code (Attempt This Online!):

from timeit import timeit
from statistics import mean, stdev

def mozway(l):
    total = 0
    out = []
    for i, n in enumerate(l):
        new_total = total + n
        if new_total >= 1 or i+1 == len(l):
            out.append(new_total)
            total = 0
        else:
            out.append(0)
            total = new_total
    return out

def Roman(lst):
    res = [0] * len(lst)
    L_size, t = len(lst), 0
    for i, n in enumerate(lst):
        t += n
        if t >= 1 or i == L_size - 1:
            res[i] = t
            t = 0
    return res

def Roman_Andrej(lst):
    L_size, t = len(lst), 0
    for i, n in enumerate(lst):
        t += n
        if t >= 1 or i == L_size - 1:
            lst[i] = t
            t = 0
        else:
            lst[i] = 0
    return lst  # updated in place

def Kelly(lst):
    s = 0.0
    res = [
        0.0 if (s := s + x if s < 1.0 else x) < 1.0 else s
        for x in lst
    ]
    res[-1] = s
    return res

funcs = mozway, Roman, Roman_Andrej, Kelly

lst = [0.2, 0.4, 0.2, 0.2, 0.1, 1.2, 3.2 ,0.2, 0.1, 0.4, 0.5, 0.1]
exp = [0.0, 0.0, 0.0, 1.0, 0.0, 1.3, 3.2 ,0.0, 0.0, 0.0, 1.2, 0.1]

for f in funcs:
    res = [round(x, 6) for x in f(lst[:])]
    print(res == exp)
  #  print(exp)
  #  print(res)

times = {f: [] for f in funcs}
def stats(f):
  ts = [t for t in sorted(times[f])[:5]]
  return f'{mean(ts):4.2f} ± {stdev(ts):4.2f} seconds '

lst *= 1800000 // len(lst)
for _ in range(10):
  for f in funcs:
    copy = lst[:]
    t = timeit(lambda: f(copy), number=1) * 10
    times[f].append(t)

for f in sorted(funcs, key=stats):
  print(stats(f), f.__name__)

mozway · Accepted Answer · 2023-02-19 18:48:44Z

Not sure what you mean by "fast compilation way":

l = [0.2, 0.4, 0.2, 0.2, 0.1, 1.2, 3.2 ,0.2, 0.1, 0.4, 0.5, 0.1]

total = 0
out = []
for i, n in enumerate(l):
    new_total = total + n
    if new_total >= 1 or i+1 == len(l):
        out.append(new_total)
        total = 0
    else:
        out.append(0)
        total = new_total

Output:

[0, 0, 0, 1.0, 0, 1.3, 3.2, 0, 0, 0, 1.2000000000000002, 0.1]

Running time for 18K values:

8.16 ms ± 296 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

precision

If you need precise floating point operations you might want to use Decimal:

from decimal import Decimal
from math import isclose

total = 0
out = []
for i, n in enumerate(map(Decimal, l)):
    new_total = total + n
    if new_total >= 1 or isclose(new_total, 1) or i+1 == len(l):
        out.append(float(new_total))
        total = 0
    else:
        out.append(0)
        total = new_total

Output:

[0, 0, 0, 1.0, 0, 1.3, 3.2, 0, 0, 0, 1.2, 0.1]

Running time for 18K values:

49.5 ms ± 2.93 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

edited Feb 19, 2023 at 18:48

answered Feb 19, 2023 at 18:11

2 Comments

Kelly Bundy Over a year ago

For input l = [0.3, 0.7], the first solution gives [0, 1.0] and the "precise" second solution gives [0, 0.9999999999999999].

mozway Over a year ago

@KellyBundy I knew I'd get such a comment when I added this. I didn't mean precision in terms of single float representation, but rather for the repeated addition of floats. If I'm not mistaken, Decimal is more precise for that than floats (the ideal would be to use Decimal from the beginning rather than converting from existing floats).
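
To illustrate the point about using Decimal from the beginning, here is a small sketch comparing Decimal values built from existing floats with Decimal values built from strings. The string literals stand in for data that was never stored as a float, which is an assumption about where the input comes from.

```python
from decimal import Decimal

# Built from floats: each Decimal carries the float's binary rounding error.
from_floats = Decimal(0.3) + Decimal(0.7)

# Built from strings: the decimal values are represented exactly.
from_strings = Decimal("0.3") + Decimal("0.7")

print(from_floats < 1)    # the inherited error keeps the sum just below 1
print(from_strings == 1)  # exact decimal arithmetic reaches the threshold
```

With exact decimal inputs the `>= 1` comparison works on its own, so the extra isclose check is not needed for this case.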
