Internal representation of MultiDim vectors is too general #91

Open
emwap opened this issue Mar 19, 2014 · 10 comments

@emwap
Member

emwap commented Mar 19, 2014

When a MultiDim vector is converted to its internal representation, type information is lost.

Internal (Pull DIM2) a

becomes

([Length],[Internal a])

The number of dimensions is converted into a runtime property and the information is lost at the type level.

Among other things, this makes it impossible to write a good Arbitrary instance that will generate only arrays with the correct number of dimensions.

Can we make the internal representation an explicit (new)type instead of the pair, and encode the dimensions so that their number is still available in the type?

cc @josefs
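
To make the request concrete, here is a minimal, untested-against-Feldspar sketch of what such a newtype could look like. `InternalPull`, `Rank`, and `mkInternalPull` are hypothetical names, and `Z`, `:.`, and `Length` are simplified stand-ins for the real definitions. The runtime data is the same pair as today, but the shape stays in the type as a phantom parameter, so a generator (for example an Arbitrary instance) can ask the type how many extents to produce.

```haskell
{-# LANGUAGE ScopedTypeVariables #-}
{-# LANGUAGE TypeOperators #-}

import Data.Proxy (Proxy (..))

-- Simplified stand-ins for the shape types (assumed, not the library's code).
data Z = Z
data sh :. l = sh :. l
infixl 3 :.

type Length = Int  -- stand-in for the real Length type

-- Hypothetical class: the number of dimensions, recovered from the type.
class Rank sh where
  rank :: Proxy sh -> Int

instance Rank Z where
  rank _ = 0

instance Rank sh => Rank (sh :. l) where
  rank _ = 1 + rank (Proxy :: Proxy sh)

-- Hypothetical internal representation: same payload as the current pair,
-- but the shape is kept as a phantom type parameter.
newtype InternalPull sh a = InternalPull ([Length], [a])

-- Smart constructor: rejects extent lists whose length disagrees with the
-- number of dimensions demanded by the type.
mkInternalPull :: forall sh a. Rank sh => [Length] -> [a] -> Maybe (InternalPull sh a)
mkInternalPull ls xs
  | length ls == rank (Proxy :: Proxy sh) = Just (InternalPull (ls, xs))
  | otherwise = Nothing
```

A generator written against this type would call `rank` to decide how many extents to draw, instead of guessing.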

@emilaxelsson
Member

How about (untested):

type family InternalShape sh
type instance InternalShape Z = ()
type instance InternalShape (sh :. l) = (InternalShape sh, Internal l)

instance (Syntax a, Shapely sh) => Syntactic (Pull sh a)
  where
    type Internal (Pull sh a) = (InternalShape sh, [Internal a])
    ...

@emilaxelsson
Member

The above solution would also lead to better code when we implement virtual tuples #61.

@emilaxelsson
Member

We probably want a closed type family:

type family InternalShape sh where
    InternalShape Z = ()
    InternalShape (Z :. l) = Internal l
    InternalShape (sh :. l) = (InternalShape sh, Internal l)

This would avoid the () for arrays with more than 0 dimensions.
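
A self-contained way to check what the closed family reduces to is to state the expected results as type equalities and let GHC verify them. This is only a sketch: `Data`, `Internal`, `DIM1`, and `DIM2` below are simplified stand-ins (the real `Internal` is an associated type of `Syntactic`), and `Internal (Data a) = a` is an assumption about how it behaves on lengths.

```haskell
{-# LANGUAGE TypeFamilies #-}
{-# LANGUAGE TypeOperators #-}

import Data.Type.Equality ((:~:) (..))

-- Simplified stand-ins (assumed, not the library's definitions).
data Z = Z
data sh :. l = sh :. l
infixl 3 :.

newtype Data a = Data a
type Length = Int

type family Internal a
type instance Internal (Data a) = a

-- The closed family from the comment above; equations are tried top to bottom.
type family InternalShape sh where
  InternalShape Z = ()
  InternalShape (Z :. l) = Internal l
  InternalShape (sh :. l) = (InternalShape sh, Internal l)

type DIM1 = Z :. Data Length
type DIM2 = DIM1 :. Data Length

-- These definitions only typecheck if the family reduces as claimed.
dim0 :: InternalShape Z :~: ()
dim0 = Refl

dim1 :: InternalShape DIM1 :~: Length
dim1 = Refl

dim2 :: InternalShape DIM2 :~: (Length, Length)
dim2 = Refl
```

If the second equation is removed, `dim1` no longer typechecks, because a 1D shape then reduces to `((), Length)`.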

@pjonsson
Member

Nesting any type of non-scalar types risks hitting Feldspar/feldspar-compiler#145.

A couple of words of general caution too:

*) I know that MultiDim triggered some CSE-oddities because of the choice of the representation, so talk to @josefs about that before changing things around for this.

*) Closed type families are a GHC 7.8 feature, and GHC 7.8 isn't released yet. Once it is released, there will be no matching Haskell Platform at first, and there will be the usual shakedown of packages that stop working because of some change, or dependencies that haven't been updated, etc.

I'm all for leveraging the new and cool stuff in GHC 7.8, but please do so after there is a Haskell Platform and things have settled a bit on Hackage.

@emilaxelsson
Member

"Nesting any type of non-scalar types risks hitting Feldspar/feldspar-compiler#145."

Yes, but only until we fix Feldspar/feldspar-compiler#3 :-)

I agree we should only use features that are in the Platform. Without closed type families, we would get ((), Data Length) instead of Data Length for a 1D vector. I think this will only lead to an extra useless assignment, which we can probably live with until closed type families are the de facto standard.

@josefs
Contributor

josefs commented Mar 20, 2014

It seems that everyone agrees that the current representation of dimensions in the vector library is something we would like to move away from. I agree with Emil that we should aim for the tuple representation, but I expect that with the compiler we have now it would generate absolutely horrible code because of the way tuples are currently compiled.

It seems that we have two ways of dealing with dimensions in multidim:

  • Implement virtual tuples and represent dimensions as nested tuples
  • Do what Anders suggests and use some form of newtype as an interim solution.

I don't understand exactly how Anders' solution would work. Anders, can you elaborate? I also have no idea how hard it would be to implement virtual tuples, so I find it difficult to say which course of action would be easiest.

@emilaxelsson
Member

It seems we can do without closed type families after all:

type family InternalShape sh
type instance InternalShape Z = ()
type instance InternalShape (Z :. l) = Internal l
type instance InternalShape (sh :. l1 :. l2) = (InternalShape (sh :. l1), Internal l2)

This should be equivalent to the closed version above (though I haven't tested).

@emilaxelsson
Member

Josef wrote:

I also have no idea how hard it would be to implement virtual tuples so I find it difficult to say which course of action would be easiest.

The main problem with virtual tuples is that they play badly with the copy-propagation-on-the-fly in fromCore. If the result location in fromCore were always a variable, it would be easy to implement virtual tuples (for example, selectN reduces to appending _N to the variable name).

But the only case where the result location is something different than a variable is when the program computes an array or a struct (I think). If we can guarantee that tuples are never stored in arrays or structs (as the virtual tuples proposal suggests), then copy propagation might not be a problem after all. So it might be easy to implement virtual tuples...
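
As a rough illustration of the naming idea only (not of fromCore or any existing compiler code), the sketch below flattens a tuple-typed variable into one scalar variable per component. `Ty`, `sel`, and `flatten` are hypothetical names, and types are represented as plain strings for brevity.

```haskell
-- Hypothetical, simplified type representation: scalars or tuples.
data Ty = Scalar String | Tuple [Ty]

-- Selecting component n of a virtual tuple stored in variable v
-- is just a rename: append _n to the variable name.
sel :: Int -> String -> String
sel n v = v ++ "_" ++ show n

-- Expand a (possibly nested) virtual tuple into the scalar variables
-- that would actually be declared, paired with their types.
flatten :: String -> Ty -> [(String, String)]
flatten v (Scalar t) = [(v, t)]
flatten v (Tuple ts) = concat [flatten (sel n v) t | (n, t) <- zip [1 ..] ts]
```

For a 3D shape represented as `((Length, Length), Length)` in a variable `v`, this yields the scalar variables `v_1_1`, `v_1_2`, and `v_2`, so no struct is ever materialised.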

@pjonsson
Member

Halfway through on this issue: Anders has working code based on the open type family sketched above for going from DIMk to a (k+1)-tuple. Code for going from the tuple to DIMk is missing.
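
For reference, one possible shape for both directions is sketched below. This is not Anders' code: `ShapeTuple`, `toTuple`, and `fromTuple` are hypothetical names, the shape types are simplified stand-ins, and the `Internal` wrapper is dropped so that each extent maps to itself. It covers only the shape part of the tuple, not the data component.

```haskell
{-# LANGUAGE FlexibleContexts #-}
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE TypeFamilies #-}
{-# LANGUAGE TypeOperators #-}

-- Simplified stand-ins for the shape types (assumed definitions).
data Z = Z deriving (Eq, Show)
data sh :. l = sh :. l deriving (Eq, Show)
infixl 3 :.

type Length = Int

-- Open family with three non-overlapping instances, as sketched earlier
-- in the thread (with Internal omitted for brevity).
type family InternalShape sh
type instance InternalShape Z = ()
type instance InternalShape (Z :. l) = l
type instance InternalShape (sh :. l1 :. l2) = (InternalShape (sh :. l1), l2)

-- Hypothetical class providing both conversions.
class ShapeTuple sh where
  toTuple :: sh -> InternalShape sh
  fromTuple :: InternalShape sh -> sh

instance ShapeTuple Z where
  toTuple Z = ()
  fromTuple () = Z

instance ShapeTuple (Z :. l) where
  toTuple (Z :. l) = l
  fromTuple l = Z :. l

instance ShapeTuple (sh :. l1) => ShapeTuple (sh :. l1 :. l2) where
  toTuple (rest :. l2) = (toTuple rest, l2)
  fromTuple (t, l2) = fromTuple t :. l2
```

Because `InternalShape` is not injective, `fromTuple` needs the target shape to be known from context, for example through a type annotation at the call site.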

@emilaxelsson
Member

👍
