I already have code that produces this list:

['vita', 'oscura', 'smarrita', 'dura', 'forte', 'paura', 'morte', 'trovai', 'scorte', 'v’intrai']

I want to turn it into this:

[('vita', 'oscura', 1), ('oscura', 'smarrita', 1), ('smarrita', 'dura', 1), ('dura', 'forte', 1), ...]

I thought I could do this with a lambda function: for every line, I ask for the first row, first item, and then for the first row, second column. That fails with an index-out-of-range error. Any pointers on how I could go about this?

This is my code so far:

def lower_clean_str(x):
  punc='!"#$%&\'()*+,-./:;<=>?@[\\]^_`{|}~'
  lowercased_str = x.lower()
  for ch in punc:
    lowercased_str = lowercased_str.replace(ch, '')
  return lowercased_str

clean_dcr=dcr.map(lower_clean_str)
print(clean_dcr.take(10))

#we split on whitespace as in ex1; notice how this time we take [-1] to grab only the last word of each line
clean_dcr=clean_dcr.map(lambda line: line.split()[-1])
print(clean_dcr.take(10))

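#this gives an error: each element is now a single word (a string),
#so line[0] is one character and line[0][1] is out of range
#clean_dcr=clean_dcr.map(lambda line: (line[0][0], line[0][1], 1))
#print(clean_dcr.take(3))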

1 Answer

For Python 3.10 and above, one can use itertools.pairwise, which yields each element together with its successor.

A sample code snippet:

import itertools

input_list = ['vita', 'oscura', 'smarrita', 'dura', 'forte', 'paura', 'morte', 'trovai', 'scorte', 'v’intrai']

output = [element + (1, ) for element in itertools.pairwise(input_list)]

For Python versions below 3.10, one can use the reference implementation of pairwise, which is also given in the linked documentation.
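
For older interpreters, a minimal sketch of such a fallback could look like the following. It follows the well-known `itertools.tee` recipe rather than any code from the question, and it assumes the words are already available as a plain Python list; the shortened `words` list is only for illustration.

```python
from itertools import tee

def pairwise(iterable):
    # Make two independent iterators over the same data.
    a, b = tee(iterable)
    # Advance the second iterator by one step so it runs one element ahead.
    next(b, None)
    # zip stops at the shorter iterator, giving (s0, s1), (s1, s2), ...
    return zip(a, b)

# Shortened sample input, for illustration only.
words = ['vita', 'oscura', 'smarrita', 'dura', 'forte']

# Append the count 1 to every pair of neighbouring words.
output = [pair + (1,) for pair in pairwise(words)]
print(output)
```

An input with fewer than two elements simply produces an empty list, so no special case is needed for very short inputs.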
