UPDATED Question: In Python, can I create a custom data structure that is a dict, but which I get and set as a set, and for which I can create a custom __str__ representation?

I want a class attribute that is structurally a dict[str, list[str]], but which the user interface (for lack of a better term) treats like a set[str]. And I want to print it like a dict with custom formatting.

Attempted Solution: I implemented a Descriptor, but I haven't figured out how to customize the __str__, so I'm thinking a Descriptor is not actually what I should be trying.

class TreatDictLikeSet():  # The Descriptor I wish existed
    def __set_name__(self, owner, name):
        self.name = name

    def __get__(self, obj, type=None) -> object:
        my_dict = obj.__dict__.get(self.name) or {}
        return [e for v in my_dict.values() for e in v]

    def __set__(self, obj, value) -> None:
        value = ...<rules to insert set values into a dict>...
        obj.__dict__[self.name] = value


class Foo():
    my_dict = TreatDictLikeSet()
  • Please fix your question so that the code is actually syntactically valid and functional. It is also confusing because it seems like parts of the code you show are unrelated to your question about the __str__ method. I suggest you read the article on How to create a Minimal, Reproducible Example and then Edit your question accordingly. Also please decide what your actual question is because this reads like an XY Problem. Commented May 21, 2023 at 18:07
  • Is there a reason you want to do this via a descriptor, rather than writing a proper data structure class that is a dict while printing like a set (or vice versa)? The __str__ method needs to be on the thing returned by any descriptor, which might make having the descriptor a waste of time. Commented May 21, 2023 at 18:31
  • @Blckknght it's possible that all I need to do is create a custom data structure rather than use Descriptors. Your feedback is helpful in that sense. I guess I'm struggling to find a good resource on how to write a proper data structure. Commented May 22, 2023 at 16:44
  • @Blckknght and what do you mean by "The __str__ method needs to be on the thing returned by any descriptor"? "On the thing"? Can you please refer explicitly to the example I wrote (or provide a better example)? Commented May 22, 2023 at 16:45
  • @yetixhunting You have it backwards, I'm afraid. You should provide a clearer example of what exactly you want. :) Commented May 22, 2023 at 17:04

1 Answer

If all you want is the behavior "assign set, but get dict", I am not sure you need to deal with descriptors at all.

Seems like a simple property would do just fine:

class Foo:
    _my_set: set[str]

    @property
    def my_dict(self) -> dict[str, list[str]]:
        return {f"key_{i}": [value] for i, value in enumerate(self._my_set)}

    @my_dict.setter
    def my_dict(self, value: set[str]) -> None:
        self._my_set = value


foo = Foo()
foo.my_dict = {'a', 'b', 'c'}
print(f"{foo.my_dict}")  # e.g. {'key_0': ['a'], 'key_1': ['c'], 'key_2': ['b']} (set order is arbitrary)

Update

If you want something that behaves like a standard collection class (e.g. a set), a good starting point is usually the collections.abc module.

For example, you could subclass MutableSet, implement its abstract methods (__contains__, __iter__, __len__, add, and discard), and also implement your own __init__ and __str__ methods for it:

from collections.abc import Iterable, Iterator, MutableSet
from typing import TypeVar

T = TypeVar("T")


class SetButAlsoDictOfLists(MutableSet[T]):
    _data: dict[str, list[T]]

    def __init__(self, values: Iterable[T] = ()) -> None:
        self._data = {}
        for value in values:
            self.add(value)

    def __str__(self) -> str:
        return str(self._data)

    def __contains__(self, value: object) -> bool:
        return any(value in list_ for list_ in self._data.values())

    def __iter__(self) -> Iterator[T]:
        return (list_[0] for list_ in self._data.values())

    def __len__(self) -> int:
        return len(self._data)

    def add(self, value: T) -> None:
        self._data[f"key_{value}"] = [value]

    def discard(self, value: T) -> None:
        # discard() must not raise if the value is absent, so pop with a default.
        self._data.pop(f"key_{value}", None)

As you wished, the underlying data structure is a dictionary of lists. I just implemented some arbitrary rule for creating the dictionary keys here for demonstration purposes.

As @Blckknght pointed out in a comment, using a different data structure underneath means that the runtime of operations can be very different. Specifically, the way I implemented __contains__ here is O(n), as opposed to O(1) on average for an actual set, because it loops over the entire values view of the dict instead of hashing the value and looking it up directly.

On the other hand, although deletion would in principle be just as expensive, discard stays O(1) with this particular key scheme, because the key can be computed directly from the value. (Note that remove, which MutableSet provides as a mixin method, checks membership via __contains__ first and is therefore still O(n).)

You could of course store the values in an actual set alongside the dictionary, thus making these operations efficient again, but this costs extra memory because every value is then referenced from two containers.
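A rough sketch of that trade-off, assuming string values only. The class name DictOfListsWithIndex and the _members attribute are made up for illustration; the key rule is the same arbitrary one used above.

```python
from collections.abc import Iterable, Iterator, MutableSet


class DictOfListsWithIndex(MutableSet[str]):
    """Hypothetical variant: dict of lists plus a companion set for lookups."""

    def __init__(self, values: Iterable[str] = ()) -> None:
        self._data: dict[str, list[str]] = {}
        self._members: set[str] = set()  # mirrors every value stored in _data
        for value in values:
            self.add(value)

    def __str__(self) -> str:
        return str(self._data)

    def __contains__(self, value: object) -> bool:
        # Hash lookup in the companion set instead of scanning the dict values.
        return value in self._members

    def __iter__(self) -> Iterator[str]:
        return (item for list_ in self._data.values() for item in list_)

    def __len__(self) -> int:
        return len(self._members)

    def add(self, value: str) -> None:
        if value not in self._members:
            self._members.add(value)
            self._data[f"key_{value}"] = [value]

    def discard(self, value: str) -> None:
        if value in self._members:
            self._members.remove(value)
            del self._data[f"key_{value}"]


s = DictOfListsWithIndex(["a", "b"])
s.add("a")        # duplicate, ignored
s.discard("zzz")  # absent, silently ignored
print("a" in s, len(s))  # True 2
print(s)  # {'key_a': ['a'], 'key_b': ['b']}
```

The price is that add and discard now have to keep two containers in sync, which is easy to get wrong once the key rule becomes more complicated.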

Either way, you can use this class as a regular (mutable) set now, but its string representation is that of the underlying dictionary:

obj = SetButAlsoDictOfLists({"a", "b", "d"})
print(obj.isdisjoint(["x", "y"]))  # True
obj.add("c")
obj.remove("d")
print(obj)  # e.g. {'key_b': ['b'], 'key_a': ['a'], 'key_c': ['c']} (order may vary)
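Since the question also asks for custom formatting, note that __str__ can return any string you like; it does not have to be str() of the dict. Below is a minimal sketch under my own assumptions: the class name PrettySetDict and the "key: item, item" line format are invented for this example, and values are assumed to be strings.

```python
from collections.abc import Iterable, Iterator, MutableSet


class PrettySetDict(MutableSet[str]):
    """Hypothetical example: set interface, dict storage, custom printing."""

    def __init__(self, values: Iterable[str] = ()) -> None:
        self._data: dict[str, list[str]] = {}
        for value in values:
            self.add(value)

    def __contains__(self, value: object) -> bool:
        # Only strings are stored, so anything else cannot be a member.
        return isinstance(value, str) and f"key_{value}" in self._data

    def __iter__(self) -> Iterator[str]:
        return (item for list_ in self._data.values() for item in list_)

    def __len__(self) -> int:
        return len(self._data)

    def add(self, value: str) -> None:
        self._data[f"key_{value}"] = [value]

    def discard(self, value: str) -> None:
        self._data.pop(f"key_{value}", None)

    def __str__(self) -> str:
        # Custom, human-readable format: one "key: item, item" line per entry.
        lines = [
            f"{key}: {', '.join(items)}"
            for key, items in sorted(self._data.items())
        ]
        return "\n".join(lines) if lines else "<empty>"

    def __repr__(self) -> str:
        # Debugging form that shows the set-like view instead of the dict.
        return f"{type(self).__name__}({sorted(self)!r})"


s = PrettySetDict({"b", "a"})
print(s)
# key_a: a
# key_b: b
print(repr(s))  # PrettySetDict(['a', 'b'])
```

print() and f-strings use __str__, while the interactive prompt and containers use __repr__, so defining both lets you choose a different view for each.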

Now if you still want that descriptor magic for some reason, you can just write one that uses such a class under the hood, i.e. one that initializes a new object in its __set__ method and returns it from its __get__ method:

from collections.abc import Iterable
from typing import Generic, TypeVar

# ... import SetButAlsoDictOfLists

_T = TypeVar("_T")


class Descriptor(Generic[_T]):
    name: str

    def __set_name__(self, owner: type, name: str) -> None:
        self.name = name

    def __get__(
        self,
        instance: object,
        owner: type | None = None,
    ) -> SetButAlsoDictOfLists[_T]:
        return instance.__dict__.get(self.name, SetButAlsoDictOfLists())

    def __set__(self, instance: object, value: Iterable[_T]) -> None:
        instance.__dict__[self.name] = SetButAlsoDictOfLists(value)

And use it like this:

class Foo:
    my_cool_set = Descriptor[str]()


foo = Foo()
print(foo.my_cool_set)  # {}
foo.my_cool_set = {"a", "b"}
print(foo.my_cool_set)  # {'key_b': ['b'], 'key_a': ['a']}
foo.my_cool_set |= ["b", "c"]
print(foo.my_cool_set)  # {'key_b': ['b'], 'key_a': ['a'], 'key_c': ['c']}
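
Two caveats with the descriptor as written: accessing it on the class itself (Foo.my_cool_set) fails because instance is None there, and the empty default returned by __get__ is a throwaway object, so calling add on it before the first assignment is silently lost. A possible variant that handles both is sketched below. The name CollectionAttribute and the factory parameter are my own invention, and the built-in set stands in for SetButAlsoDictOfLists only to keep the snippet self-contained.

```python
from collections.abc import Callable, Iterable
from typing import Any, Optional


class CollectionAttribute:
    """Hypothetical descriptor that wraps assigned values in a collection."""

    def __init__(self, factory: Callable[[Iterable[Any]], Any] = set) -> None:
        self.factory = factory  # stand-in for SetButAlsoDictOfLists

    def __set_name__(self, owner: type, name: str) -> None:
        self.name = name

    def __get__(self, instance: Optional[object], owner: Optional[type] = None) -> Any:
        if instance is None:
            # Accessed on the class itself: return the descriptor object.
            return self
        # setdefault stores the empty default, so in-place changes are kept.
        return instance.__dict__.setdefault(self.name, self.factory(()))

    def __set__(self, instance: object, value: Iterable[Any]) -> None:
        instance.__dict__[self.name] = self.factory(value)


class Foo:
    items = CollectionAttribute()


foo = Foo()
foo.items.add("a")      # mutates the stored default, not a throwaway object
print(foo.items)        # {'a'}
foo.items = ["b", "b"]  # any iterable is converted by the factory
print(foo.items)        # {'b'}
print(isinstance(Foo.items, CollectionAttribute))  # True
```

Passing factory=SetButAlsoDictOfLists instead of the default should give the same behaviour with the dict-backed class.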

3 Comments

I don't want to get dict. I want to (1) get set, (2) assign set, (3) print as dict, with custom formatting. That's why I thought I needed to implement my own __str__ somehow.
And I want the actual data structure object in the background to be a dict. Does this make sense (even if it sounds unwise)? Is it possible with a custom data structure?
@yetixhunting I updated my answer. Maybe this will point you in the right direction.
