
Refactoring dtypes (and constants) #8

Description

@nicola-bastianello

dtypes

I implemented a hacky Array.dtype in #7, which matches dtypes by string name. this is not robust. also, the dtype handling doesn't follow the same abstract-contract approach as the backend: the dtype conversion is just a dict, which backends are not required to define

I'm thinking of something like this:

  • we add abstract properties to the _Backend object, one for each of the dtypes supported by the codebase, along the lines of the float64 snippet below. each property returns the framework-native object representing its dtype (for float64: np.dtype(np.float64) for np and jnp, torch.float64, tf.float64), or None if that dtype is not supported (the return value could also depend on the selected device, e.g. if float64 is not supported on GPU)

      @property
      @abstractmethod
      def float64(self) -> Any:
          """Framework-native float64 dtype, or None if unsupported."""

  • we then define a dtype object which, like Array, holds a framework-native dtype object (an instance of numpy.dtype, torch.dtype, or tf.DType) in a slot/attribute named e.g. _dtype. it exposes other useful methods, such as __str__ returning "float64" and __eq__, which checks equality with another iop.dtype (by comparing their _dtype attributes) and also allows checks against framework-native objects
  • the dtype should allow init from either a framework-native dtype (which is then used as the _dtype attribute) or a string (in which case the matching backend property is used)
  • the dtype objects should be instantiated either at import of iop or at registration of the backend, by calling the corresponding backend property and erroring out if the backend is not defined. I'm not sure if creating them at import is doable
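
a rough sketch of how the pieces above could fit together, not a definitive design. everything except the names _Backend, float64 and _dtype is an assumption made for illustration: FakeNativeDType and FakeBackend stand in for a real framework, _DTYPE_NAMES is a hypothetical list of supported names, and passing the backend explicitly to dtype() is just one way to resolve strings.

```python
from abc import ABC, abstractmethod
from typing import Any, Optional


class FakeNativeDType:
    """Hypothetical stand-in for a framework-native dtype (e.g. numpy.dtype)."""

    def __init__(self, name: str) -> None:
        self.name = name

    def __eq__(self, other: object) -> bool:
        return isinstance(other, FakeNativeDType) and other.name == self.name

    def __hash__(self) -> int:
        return hash(self.name)


class _Backend(ABC):
    """Abstract contract: one property per dtype supported by the codebase."""

    @property
    @abstractmethod
    def float32(self) -> Optional[Any]:
        """Framework-native float32 dtype, or None if unsupported."""

    @property
    @abstractmethod
    def float64(self) -> Optional[Any]:
        """Framework-native float64 dtype, or None if unsupported."""


class FakeBackend(_Backend):
    """Hypothetical backend that supports float64 but not float32."""

    @property
    def float32(self) -> Optional[Any]:
        return None

    @property
    def float64(self) -> Optional[Any]:
        return FakeNativeDType("float64")


# Assumed list of dtype names known to the codebase.
_DTYPE_NAMES = ("float32", "float64")


class dtype:
    """Wrapper holding a framework-native dtype in the _dtype slot."""

    __slots__ = ("_dtype", "_name")

    def __init__(self, spec: Any, backend: _Backend) -> None:
        if isinstance(spec, str):
            # String init: look up the matching backend property.
            if spec not in _DTYPE_NAMES:
                raise ValueError(f"unknown dtype name: {spec!r}")
            native = getattr(backend, spec)
            if native is None:
                raise TypeError(f"{spec} is not supported by this backend")
            self._dtype, self._name = native, spec
            return
        # Native init: find which backend property the object corresponds to.
        for name in _DTYPE_NAMES:
            native = getattr(backend, name)
            if native is not None and native == spec:
                self._dtype, self._name = spec, name
                return
        raise TypeError(f"unrecognized native dtype: {spec!r}")

    def __str__(self) -> str:
        return self._name

    def __eq__(self, other: object) -> bool:
        if isinstance(other, dtype):
            return self._dtype == other._dtype
        # Also allow comparison against framework-native objects.
        return self._dtype == other

    def __hash__(self) -> int:
        return hash(self._name)


backend = FakeBackend()
f64 = dtype("float64", backend)
```

with this sketch, dtype("float64", backend) and dtype(FakeNativeDType("float64"), backend) compare equal, and asking for "float32" raises because the fake backend returns None for it.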

notes:

  • numpy (and hence jax) defines np.float64 which is a type and np.dtype(np.float64) which is a dtype object. the latter is what we should use
  • the naming convention seems to favor dtype for objects, only tf uses DType; so we can go for dtype I think
  • see array API also

Array.dtype

as I mentioned above, the current dtype implementation is quite hacky. with the features above in place, we would replace it with a property that returns an iop.dtype object created by 1) taking the dtype of Array.value and 2) using it to instantiate an iop.dtype() object

a related question: do we want to make Array.dtype a slot, or keep the current approach of instantiating the dtype only on demand? I currently prefer the latter, as I don't see a huge benefit in a slot, and adding another one might decrease performance slightly
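
a minimal sketch of the on-demand option, under assumptions: DType is a stripped-down stand-in for the proposed iop.dtype, and the native array is faked with a SimpleNamespace that only has a dtype attribute. only Array.value and Array.dtype come from the discussion above.

```python
from types import SimpleNamespace
from typing import Any


class DType:
    """Hypothetical, stripped-down stand-in for the proposed iop.dtype."""

    __slots__ = ("_dtype",)

    def __init__(self, native: Any) -> None:
        self._dtype = native

    def __eq__(self, other: object) -> bool:
        return isinstance(other, DType) and self._dtype == other._dtype

    def __hash__(self) -> int:
        return hash(self._dtype)


class Array:
    # Only the wrapped value is stored; there is no dtype slot.
    __slots__ = ("value",)

    def __init__(self, value: Any) -> None:
        self.value = value

    @property
    def dtype(self) -> DType:
        # Built on each access from the native array's own dtype.
        return DType(self.value.dtype)


# Fake native array exposing only a dtype attribute (assumption).
native = SimpleNamespace(dtype="float64")
a = Array(native)
```

each access to a.dtype builds a fresh wrapper, so two accesses give equal but distinct objects. the slot alternative would store the wrapper once at construction, trading a bit of memory per Array for skipping that allocation.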

constants

a similar approach should be taken for the constants inf, e, pi, nan (see here): we define an abstract property in _Backend for each, which returns the corresponding framework-native constant. then we have four custom objects (iop.inf etc.) that hold the framework-native constant as an attribute, and expose methods that allow for all operations one can do on these constants
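
a sketch of the same idea for constants, again under assumptions: MathBackend is a hypothetical backend that uses Python's math module as its "framework", Constant is an illustrative wrapper name, and only a handful of operators are forwarded here rather than the full set.

```python
import math
from abc import ABC, abstractmethod
from typing import Any


class _Backend(ABC):
    """Abstract contract: one property per constant."""

    @property
    @abstractmethod
    def inf(self) -> Any: ...

    @property
    @abstractmethod
    def e(self) -> Any: ...

    @property
    @abstractmethod
    def pi(self) -> Any: ...

    @property
    @abstractmethod
    def nan(self) -> Any: ...


class MathBackend(_Backend):
    """Hypothetical backend whose native constants are Python floats."""

    inf = property(lambda self: math.inf)
    e = property(lambda self: math.e)
    pi = property(lambda self: math.pi)
    nan = property(lambda self: math.nan)


class Constant:
    """Holds the native constant and forwards a few operations to it."""

    __slots__ = ("_name", "_value")

    def __init__(self, name: str, backend: _Backend) -> None:
        self._name = name
        self._value = getattr(backend, name)

    @staticmethod
    def _unwrap(x: Any) -> Any:
        return x._value if isinstance(x, Constant) else x

    def __repr__(self) -> str:
        return self._name

    def __float__(self) -> float:
        return float(self._value)

    def __neg__(self) -> Any:
        return -self._value

    def __add__(self, other: Any) -> Any:
        return self._value + self._unwrap(other)

    __radd__ = __add__

    def __mul__(self, other: Any) -> Any:
        return self._value * self._unwrap(other)

    __rmul__ = __mul__

    def __eq__(self, other: object) -> bool:
        return self._value == self._unwrap(other)

    def __lt__(self, other: Any) -> bool:
        return self._value < self._unwrap(other)

    def __gt__(self, other: Any) -> bool:
        return self._value > self._unwrap(other)

    def __hash__(self) -> int:
        return hash(self._name)


backend = MathBackend()
inf = Constant("inf", backend)
pi = Constant("pi", backend)
nan = Constant("nan", backend)
```

in this sketch arithmetic returns the native value rather than another Constant, which keeps results in the framework's own types. whether results should be re-wrapped is a design choice left open here.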

Labels: enhancement, question
