0

I am new to Python and came across an object deserialization (unpickling) issue while testing a program in JupyterLab.

I am trying to serialize and deserialize objects of the Employee class, as shown below.

- Definition of Employee class:

class Employee:
    def __init__(self, id, name, salary):
        self.id = id
        self.name = name
        self.salary = salary


    def display(self):
        print('{:5d} -- {:20s} -- {:10.2f}'.format(self.id, self.name, self.salary))

- Code to pickle Employee Object:

import pickle

file = open('employee-data.csv', 'wb')

n = int(input('How many employees ?'))

for i in range(n):
    id = int(input('Enter the Employee id:'))
    name = input('Enter the Employee name:')
    salary = float(input('Enter the Employee salary:'))

    ob = Employee(id, name, salary)

    pickle.dump(ob,file)

file.close()

- Code to unpickle Employee Object:

import pickle

file2 = open('employee-data.csv', 'rb')

print('Employee Details ....')

while True:
    try:
        obj = pickle.load(file2)
        obj.display()
    except EOFError:
        print('End of File Reached ...')
        break

file2.close()

Error:

> ---------
> UnpicklingError   Traceback (most recent call last)
> Cell In[5], line 9
>       7 while True:
>       8     try:
> ----> 9         obj = pickle.load(file2)
>      10         obj.display()
>      11     except EOFError:
> 
> UnpicklingError: invalid load key, '\xef'.

The following screenshot shows that the serialization code ran successfully and that the file was created.

[screenshot]

What could be the problem here? Any suggestions or feedback would be appreciated.

8
  • 2
    Cannot reproduce (works for me). Can you boil down the problem to a minimal, reproducible example (e.g. provide the data for one or two particular Employee instances that make the code fail, the creation of which is preferably hard-coded and not via input())? Maybe also add OS and Python version info. As a side-note: I would not call the file *.csv, since pickle does not produce CSV files (i.e. comma-separated text files). Commented Jul 31 at 7:48
  • 1
    There are a few similar occurrences, since closed, that could help you; refer to: github.com/CSAILVision/places365/issues/49. Also, the way you store the pickle file is wrong: use a .pkl extension for pickle files, not .csv. Re-check how you are opening and closing the file, and try to use 'with', which handles all of this for you. Commented Jul 31 at 7:54
  • 1
    Works just fine here. Commented Jul 31 at 8:36
  • 2
    @GunjanShah Appreciate that it works now. I do not think this is related to the suffix, though. Most likely, pickle.dump() or file.close() was not completed, so a corrupted file was written. I guess you now wrote a completely new file rather than just changing the suffix from .csv to .pkl? To be more error-proof, I agree with Surya R's comment: use a with open(...) as file: context for reading and writing, rather than calling close() explicitly. See example here. I won't add an answer, since I don't think I provided a solution. Commented Jul 31 at 8:43
  • 1
    @Simon, it might sound weird, but I did not change anything in the code except the file extension. This seems to be a problem with the JupyterLab platform (jupyter.org/try-jupyter/lab). I tried the same code in the PyCharm IDE and it works fine with the CSV extension. Commented Jul 31 at 8:56
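
The suggestions in the comments above (hard-coded data instead of input(), a .pkl suffix, and with blocks instead of explicit close() calls) could look roughly like the following. This is only a sketch: the helper names write_employees and read_employees, the sample employees, and the temporary directory are made up for illustration and are not part of the original code.

```python
import pickle
import tempfile
from pathlib import Path


class Employee:
    def __init__(self, id, name, salary):
        self.id = id
        self.name = name
        self.salary = salary

    def display(self):
        print('{:5d} -- {:20s} -- {:10.2f}'.format(self.id, self.name, self.salary))


def write_employees(path, employees):
    # The with block closes (and flushes) the file even if dump() raises.
    with open(path, 'wb') as f:
        for emp in employees:
            pickle.dump(emp, f)


def read_employees(path):
    # Load objects one by one until the end of the file is reached.
    loaded = []
    with open(path, 'rb') as f:
        while True:
            try:
                loaded.append(pickle.load(f))
            except EOFError:
                break
    return loaded


# Hard-coded sample data (hypothetical values) instead of input().
sample = [Employee(1, 'Alice', 1000.0), Employee(2, 'Bob', 2500.5)]
pkl_path = Path(tempfile.mkdtemp()) / 'employee-data.pkl'

write_employees(pkl_path, sample)
for emp in read_employees(pkl_path):
    emp.display()
```

Writing to a fresh temporary directory also rules out the possibility that an older, damaged copy of the file is being read back.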

1 Answer

2

Your code works, so I will not comment on it. Other comments already suggest improvements.

The error you're encountering is not due to the logic, but rather due to how the file was saved or modified afterward.

The error message suggests that the file contains unexpected bytes at the beginning, specifically the UTF-8 BOM (Byte Order Mark). This can happen if:

  • You opened employee-data.csv in a text editor (like Notepad or Excel), and

  • The editor silently saved the file with a different encoding - typically UTF-8 with BOM.

When this happens, the first few bytes of the file (\xef\xbb\xbf) are not valid in the context of a binary pickle stream, hence the UnpicklingError.
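
To check whether a file really starts with a BOM, you can look at its first three bytes. The following is a minimal sketch; the function name starts_with_utf8_bom, the file names, and the sample dictionary are assumptions made for illustration. It also prints the first byte of a freshly created pickle, which for binary pickle protocols (2 and later) is the protocol marker \x80 rather than \xef.

```python
import codecs
import pickle
import tempfile
from pathlib import Path


def starts_with_utf8_bom(path):
    """Return True if the file begins with the bytes EF BB BF."""
    with open(path, 'rb') as f:
        return f.read(3) == codecs.BOM_UTF8


tmp_dir = Path(tempfile.mkdtemp())
clean_path = tmp_dir / 'clean.pkl'        # hypothetical file names
bom_path = tmp_dir / 'with-bom.pkl'

# A plain dict stands in for an Employee object here.
payload = pickle.dumps({'id': 1, 'name': 'Alice', 'salary': 1000.0})
clean_path.write_bytes(payload)
bom_path.write_bytes(codecs.BOM_UTF8 + payload)

print(payload[:1])                        # first byte of an untouched pickle
print(starts_with_utf8_bom(clean_path))   # file written directly by Python
print(starts_with_utf8_bom(bom_path))     # file with a BOM prepended
```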

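Try regenerating the file and loading it again without opening it in an editor in between. Saving it from the editor in a different format (e.g. ANSI) may remove the BOM, but a text editor can still alter other bytes of a binary file.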
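
If regenerating the file is not an option, you could try skipping the BOM before unpickling. This is a sketch under a strong assumption: it only helps when the BOM was the only change and the rest of the bytes are intact. The function name load_all_skipping_bom and the sample records are hypothetical.

```python
import codecs
import io
import pickle


def load_all_skipping_bom(raw):
    """Unpickle every object in raw, ignoring one leading UTF-8 BOM."""
    if raw.startswith(codecs.BOM_UTF8):
        raw = raw[len(codecs.BOM_UTF8):]
    stream = io.BytesIO(raw)
    objects = []
    while True:
        try:
            objects.append(pickle.load(stream))
        except EOFError:
            # No more pickled objects in the stream.
            break
    return objects


# Simulated damaged file contents: a BOM followed by two intact pickles.
records = [{'id': 1, 'name': 'Alice'}, {'id': 2, 'name': 'Bob'}]
damaged = codecs.BOM_UTF8 + b''.join(pickle.dumps(r) for r in records)

recovered = load_all_skipping_bom(damaged)
print(recovered)
```

For a real file, the bytes would come from open(path, 'rb').read() instead of the simulated damaged value.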


1 Comment

Well spotted! I can reproduce OP's error by inserting a BOM via bom_file = open('employee-data.csv', 'r+b'); bom = b'\xef\xbb\xbf'; content = bom_file.read(); bom_file.seek(0); bom_file.write(bom + content); bom_file.close() between the pickling and unpickling code.
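
Spelled out as a self-contained script, that reproduction could look like this. It is a sketch only: a plain dict stands in for an Employee object, and the file is written to a temporary directory so that no existing file is touched.

```python
import pickle
import tempfile
from pathlib import Path

path = Path(tempfile.mkdtemp()) / 'employee-data.csv'

# Step 1: write a valid pickle (a plain dict stands in for an Employee).
with open(path, 'wb') as f:
    pickle.dump({'id': 1, 'name': 'Alice', 'salary': 1000.0}, f)

# Step 2: prepend the UTF-8 BOM, as in the one-liner above.
bom = b'\xef\xbb\xbf'
with open(path, 'r+b') as f:
    content = f.read()
    f.seek(0)
    f.write(bom + content)

# Step 3: unpickling now fails on the very first byte.
error_message = None
try:
    with open(path, 'rb') as f:
        pickle.load(f)
except pickle.UnpicklingError as exc:
    error_message = str(exc)

print(error_message)
```

On CPython, the printed message should name the same invalid load key as the traceback in the question.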
