I am currently running the Python script below:

from concurrent.futures import ThreadPoolExecutor, as_completed

# Function to process MMR search
def process_mmr_search(row, itemdesc):
    try:
        formatted_itemdesc = str(row[itemdesc])
        print('formatted_itemdesc mmr', formatted_itemdesc)
        docs = indexed_taxonomy_described_cleaned.max_marginal_relevance_search(formatted_itemdesc, 20)
        print('docs mmr',docs)
        return [doc.page_content for doc in docs]
    except Exception as e:
        print(f"Error in MMR search: {e}")
        return []

# Function to handle threading for MMR search
def threaded_mmr_search(index, row, itemdesc):
    mmr_matches = process_mmr_search(row, itemdesc)
    return index, mmr_matches


# Run the MMR search with threading
with ThreadPoolExecutor(max_workers=4) as executor:  # Adjust max_workers based on available resources
    future_mmr = {executor.submit(threaded_mmr_search, index, row, 'Material Description'): index for index, row in spend_sheet_uniques.iterrows()}
    
    for future in as_completed(future_mmr):
        index, mmr_matches = future.result()
        spend_sheet_uniques.at[index, 'Best_Matches_MMR'] = str(mmr_matches)

Objective: spend_sheet_uniques is a DataFrame. The whole logic is simply to perform a similarity search for each row in that DataFrame; the index being searched is backed by FAISS.

Issue: After processing some rows, the application hangs and does not move forward. It does not stop at any specific row; the row differs from run to run, and only rarely are all the rows processed.
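
One way to narrow down a hang like this is to put a deadline on the whole batch and record which rows never finished, rather than blocking forever on future.result(). The following is only a minimal sketch under stated assumptions: run_with_timeout and slow_search are hypothetical names that do not appear in the script above, and slow_search is a stand-in for the real FAISS call.

```python
import time
from concurrent.futures import ThreadPoolExecutor, wait


def run_with_timeout(search_fn, items, timeout_s, max_workers=4):
    """Run search_fn over (key, value) pairs with one overall deadline.

    Returns (results, unfinished): results maps key -> search result for
    every task that completed in time, unfinished is the sorted list of
    keys whose task was still pending or running at the deadline.
    """
    results = {}
    executor = ThreadPoolExecutor(max_workers=max_workers)
    try:
        futures = {executor.submit(search_fn, value): key for key, value in items}
        done, not_done = wait(futures, timeout=timeout_s)
        for future in done:
            results[futures[future]] = future.result()
        unfinished = sorted(futures[f] for f in not_done)
    finally:
        # Do not block on stuck workers; cancel_futures needs Python 3.9+.
        executor.shutdown(wait=False, cancel_futures=True)
    return results, unfinished


def slow_search(text):
    """Hypothetical stand-in for the real search; "slow" simulates a stall."""
    if text == "slow":
        time.sleep(0.8)
    return [text]
```

Calling run_with_timeout(slow_search, [(0, "a"), (1, "slow"), (2, "b")], timeout_s=0.2) reports key 1 as unfinished, which at least tells you which inputs were in flight when the run stalled. A thread that is genuinely deadlocked cannot be killed this way; the standard library's faulthandler.dump_traceback_later can additionally print every thread's stack after a delay to show where it is stuck.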

1 Answer

There is nothing wrong with how you use your thread pool; this looks like a deadlock. To isolate the tasks from one another, you can try a ProcessPoolExecutor instead:

if __name__ == "__main__":
    with ProcessPoolExecutor(...

Note: since your tasks look CPU-bound, parallelism ("true concurrency") via processes is probably what you need anyway (the alternative being to disable the GIL).
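
As a rough illustration of the suggestion above, the pattern might look like the sketch below. The names search_one and run_all are hypothetical, and search_one is a placeholder for the real MMR search, not the asker's code. The worker has to be a top-level function so that it can be pickled and sent to the child processes, and the pool is only started under the `if __name__ == "__main__":` guard.

```python
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor, as_completed


def search_one(index, text):
    """Hypothetical stand-in for the real search; returns (index, matches)."""
    return index, [text.upper()]


def run_all(rows, executor_cls=ProcessPoolExecutor, max_workers=4):
    """Submit one task per (index, text) pair and collect results by index.

    executor_cls is a parameter so the same code can be run with
    ThreadPoolExecutor for comparison.
    """
    results = {}
    with executor_cls(max_workers=max_workers) as executor:
        futures = [executor.submit(search_one, index, text) for index, text in rows]
        for future in as_completed(futures):
            index, matches = future.result()
            results[index] = matches
    return results


if __name__ == "__main__":
    # The guard keeps child processes from re-running this block on import.
    print(run_all([(0, "steel bolt"), (1, "copper wire")]))
```

With processes, anything the worker touches (here the index object in the real script) must either be picklable or be loaded inside each worker, so this is a starting point under those assumptions rather than a drop-in replacement.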

1 Comment

I did try ProcessPoolExecutor, but the issue still persists.
