3

I have a search and test problem : in the list of prime numbers from 2 to 100k, we're searching the first set of 5 with the following criteria :

  • p1 < p2 < p3 < p4 < p5
  • any combination of 2 primes from the solution (3 and 7 => 37 and 73) must also be a prime
  • sum(p1..p5) is the smallest possible sum of primes satisfying criteria, and is above 100k

I can totally code such a thing, but i have a severe optimization problem : my code is super duper slow. I have a list of primes under 100k, a list of primes over 100k, and a primality test which works well, but i do not see how to optimize that to obtain a result in a correct time.

For a basic idea :

  • the list of all primes under 100k contains 9592 items
  • the list of all primes under 1 billion contains approximately 51 million lines
  • i have the list of all primes under 1 billion, by length

Thanks for the help

17
  • 4
    This may be better suited to codereview because your code works, it's just slow. Commented Dec 1, 2023 at 14:33
  • 2
    Formulation as a graph problem: you're looking for 5-cliques in the graph where vertices are prime numbers below 100k, and two vertices share an edge iff their concatenations are prime numbers. Commented Dec 1, 2023 at 15:39
  • 1
    you need the primes up to 10E9 to do the test if p4-p5 or p5-p4 is a prime Commented Dec 1, 2023 at 15:43
  • 1
    What does "super duper slow" mean to you? How long does it take to run? Commented Dec 2, 2023 at 0:11
  • 2
    This question is too broad as currently written. It's also unclear exactly what you're asking. if you have code that works, why isn't that code here in the question? If you have working code and just want it to be improved, consider asking on Code Review or provide specific, measurable metrics by which others can consider the code "improved". Commented Dec 14, 2023 at 19:15

1 Answer 1

3

Here is a version that computes the minimal combination of the prime numbers with restrictions as stated in the question.

The majority of runtime is spent in pre-computing all valid combinations of prime numbers. On my computer (AMD 5700X) this runs in 1 minute 20 seconds:

import numpy as np
from numba import njit, prange


@njit
def prime(a):
    if a < 2:
        return False
    for x in range(2, int(a**0.5) + 1):
        if a % x == 0:
            return False
    return True


@njit
def str_to_int(s):
    final_index, result = len(s) - 1, 0
    for i, v in enumerate(s):
        result += (ord(v) - 48) * (10 ** (final_index - i))
    return result


@njit
def generate_primes(n):
    out = []
    for i in range(3, n + 1):
        if prime(i):
            out.append(i)
    return out


@njit(parallel=True)
def get_comb(n=100_000):
    # generate all primes < n
    primes = generate_primes(n)
    n_primes = len(primes)

    # generate all valid combinations of primes
    combs = np.zeros((n_primes, n_primes), dtype=np.uint8)

    for i in prange(n_primes):
        for j in prange(i + 1, n_primes):
            p1, p2 = primes[i], primes[j]

            c1 = str_to_int(f"{p1}{p2}")
            c2 = str_to_int(f"{p2}{p1}")

            if not prime(c1) or not prime(c2):
                continue

            combs[i, j] = 1

    all_combs = []

    for i_p1 in prange(0, n_primes):
        for i_p2 in prange(i_p1 + 1, n_primes):
            if combs[i_p1, i_p2] == 0:
                continue
            for i_p3 in prange(i_p2 + 1, n_primes):
                if combs[i_p1, i_p3] == 0:
                    continue
                if combs[i_p2, i_p3] == 0:
                    continue
                for i_p4 in prange(i_p3 + 1, n_primes):
                    if combs[i_p1, i_p4] == 0:
                        continue
                    if combs[i_p2, i_p4] == 0:
                        continue
                    if combs[i_p3, i_p4] == 0:
                        continue
                    for i_p5 in prange(i_p4 + 1, n_primes):
                        if combs[i_p1, i_p5] == 0:
                            continue
                        if combs[i_p2, i_p5] == 0:
                            continue
                        if combs[i_p3, i_p5] == 0:
                            continue
                        if combs[i_p4, i_p5] == 0:
                            continue

                        p1, p2, p3, p4, p5 = (
                            primes[i_p1],
                            primes[i_p2],
                            primes[i_p3],
                            primes[i_p4],
                            primes[i_p5],
                        )

                        ccomb = np.array([p1, p2, p3, p4, p5], dtype=np.int64)
                        if np.sum(ccomb) < n:
                            continue

                        all_combs.append(ccomb)
                        print(ccomb)
                        break

    return all_combs


all_combs = np.array(get_comb())
print()
print("Minimal combination:")
print(all_combs[np.sum(all_combs, axis=1).argmin()])

Prints:

[    3 28277 44111 70241 78509]
[    7    61 25939 26893 63601]
[    7    61 25939 61417 63601]                     
[    7    61 25939 61471 86959]                     
[    7  2467 24847 55213 92593]                     
[    7  3361 30757 49069 57331]                

...

[ 1993 12823 35911 69691 87697]
[ 2287  4483  6793 27823 67723]
[ 3541  9187 38167 44257 65677]
                                                    
Minimal combination:                                
[   13   829  9091 17929 72739]
                                                    
real    1m20,599s                                   
user    0m0,011s                                    
sys     0m0,008s                                    
Sign up to request clarification or add additional context in comments.

3 Comments

your answer was perfect, right on spot, and super fast (just under a minute on a I7). I would definitely look into numba, prange and njit. Thanks a lot !
range(2, int(a**0.5) + 1) why test with all the even numbers in this range, itertools.chain([2], range(3,int(a**0.5)+1,2)), or you only have to use the prime numbers you already found that are <int(a**0.5)+1
def generate_primes(n): return [i for i in range(3, n+1,2) if prime(i)]

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.