OCaml functors (parameterized modules) emulation in Haskell

Is there a recommended way to use type classes to emulate OCaml-like parameterized modules?

For instance, I need a module that implements a complex generic computation which can be parameterized by different types, functions, and so on. To be more specific, let it be a kMeans implementation that can be parameterized by the type of values, the type of vectors (list, unboxed vector, vector, tuple, etc.), and the distance calculation strategy.

For convenience, and to avoid a crazy number of intermediate types, I want to capture this polymorphism in a single DataSet class that contains all the required interfaces. I also tried TypeFamilies to avoid a large number of typeclass parameters (which cause problems of their own):

{-# Language MultiParamTypeClasses
           , TypeFamilies
           , FlexibleContexts
           , FlexibleInstances
           , EmptyDataDecls
           , FunctionalDependencies
           #-}

module Main where

import qualified Data.List as L
import qualified Data.Vector as V
import qualified Data.Vector.Unboxed as U

import Distances
-- contains instances for Euclid distance
-- import Distances.Euclid as E
-- contains instances for Kullback-Leibler "distance"
-- import Distances.Kullback as K

class ( Num (Elem c)
     ,  Ord (TLabel c)
     ,  WithDistance (TVect c) (Elem c)
     ,  WithDistance  (TBoxType c) (Elem c)
     ) 
     => DataSet c  where
    type Elem c  ::  *
    type TLabel c ::  *
    type TVect c :: * -> *
    data TDistType  c :: *
    data TObservation c :: *
    data TBoxType c :: * -> *
    observations :: c -> [TObservation c]
    measurements :: TObservation c -> [Elem c]
    label        :: TObservation c -> TLabel c
    distance :: TBoxType c (Elem c) -> TBoxType c (Elem c) -> Elem c
    distance = distance_

instance DataSet () where
  type Elem () = Float
  type TLabel () = Int
  data TObservation () = TObservationUnit [Float]
  data TDistType ()
  type TVect () = V.Vector 
  data TBoxType () v = VectorBox (V.Vector v)
  observations () = replicate 10 (TObservationUnit [0,0,0,0])
  measurements (TObservationUnit xs) = xs
  label (TObservationUnit _) = 111 

kMeans :: ( Floating (Elem c)
          , DataSet c
          ) => c
            -> [TObservation c]
kMeans s = undefined -- here the implementation
  where
    labels = map label (observations s)
    www  = L.map (V.fromList.measurements) (observations s)
    zzz  = L.zipWith distance_ www www
    wtf1 = L.foldl wtf2 0 (observations s)
    wtf2 acc xs = acc + L.sum (measurements xs)
    qq = V.fromList [1,2,3 :: Float]
    l = distance (VectorBox qq) (VectorBox qq)

instance Floating a => WithDistance (TBoxType ()) a where
  distance_ xs ys = undefined

instance Floating a => WithDistance V.Vector a where
  -- Euclidean distance: square the coordinate differences, not the sums
  distance_  xs ys = sqrt $ V.sum (V.zipWith (\x y -> (x-y)**2) xs ys)

This code compiles and works somehow, but it's pretty ugly and hacky.

kMeans should be parameterized by the type of value (number, floating point number, anything), the container type (vector, list, unboxed vector, possibly tuple), and the distance calculation strategy.

There are also types for observations (the type of sample provided by the user; there will be many of them) and for the measurements contained in each observation.

So the problems are:

1) If a function does not mention the parametric types in its signature, the types will not be inferred.

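2) I still have no idea how to declare the typeclass WithDistance so that it has different instances for different distance types (Euclid, Kullback, or anything else via phantom types).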

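Right now WithDistance is polymorphic only in the box type and the value type, so if we need different strategies, we can only put them in different modules and import the required one. But that is a hack and not a typed approach, right?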

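All of this can be done fairly easily in OCaml with its modules. What is the proper approach to implementing such things in Haskell?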

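Typeclasses with TypeFamilies look somewhat similar to parameterized modules, but they work differently. I really need something like that.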

+4
1

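It is true that Haskell lacks useful features found in the *ML module systems.
There is an ongoing effort to extend Haskell's module system: http://plv.mpi-sws.org/backpack/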

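But I think you can get a bit further without ML modules. Your design follows the God class anti-pattern, and that is why it is anti-modular.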

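A type class is useful only when every type can have at most one instance of that class. For example, the DataSet () instance fixes type TVect () = V.Vector, and you cannot easily create a similar instance with TVect = U.Vector.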

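Start by implementing the kMeans function, then generalize it by replacing concrete types with type variables, constraining those type variables with type classes where needed.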

Here is a small example. At first you have a non-generic implementation:

kMeans :: Int -> [(Double,Double)] -> [[(Double,Double)]]
kMeans k points = ...

Then you generalize it over the distance calculation strategy:

kMeans
   :: Int
   -> ((Double,Double) -> (Double,Double) -> Double)
   -> [(Double,Double)]
   -> [[(Double,Double)]]
kMeans k distance points = ...
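
To make the second signature concrete, here is a minimal runnable sketch. It is only one possible body: it assumes Lloyd's iteration, takes the first k points as the initial centres, and runs a fixed ten rounds. None of these choices come from the text above, and the helper names (P, centroid, assign, euclidean) are hypothetical.

```haskell
-- Concrete point type, as in the signature above.
type P = (Double, Double)

-- Mean of a non-empty list of points.
centroid :: [P] -> P
centroid ps = (sum xs / n, sum ys / n)
  where
    (xs, ys) = unzip ps
    n        = fromIntegral (length ps)

-- Put every point into the cluster of its nearest centre.
-- Returns one (possibly empty) cluster per centre, in centre order.
assign :: (P -> P -> Double) -> [P] -> [P] -> [[P]]
assign distance centres points =
  [ [p | p <- points, nearest p == i] | i <- [0 .. length centres - 1] ]
  where
    -- Index of the closest centre; ties go to the lower index.
    nearest p = snd (minimum (zip (map (distance p) centres) [0 :: Int ..]))

-- The distance strategy is an ordinary function argument.
kMeans :: Int -> (P -> P -> Double) -> [P] -> [[P]]
kMeans k distance points = go (10 :: Int) (take k points)
  where
    go 0 centres = assign distance centres points
    go n centres =
      go (n - 1) (zipWith recentre centres (assign distance centres points))
    -- Assumption: an empty cluster keeps its previous centre.
    recentre old []      = old
    recentre _   cluster = centroid cluster

-- One possible strategy to pass in (hypothetical helper).
euclidean :: P -> P -> Double
euclidean (x1, y1) (x2, y2) = sqrt ((x1 - x2) ^ 2 + (y1 - y2) ^ 2)
```

For example, kMeans 2 euclidean [(0,0),(0,1),(10,10),(10,11)] separates the two points near the origin from the two points near (10,10). Swapping the strategy means passing a different function, with no class or instance involved.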

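Now you can generalize over the type of points, but this requires introducing a class that captures the properties of points used by the distance computation, for example getting the list of coordinates: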

kMeans
    :: Point p
    => Int -> (p -> p -> Coord p) -> [p]
    -> [[p]]
kMeans k distance points = ...

class Num (Coord p) => Point p where
    type Coord p
    coords :: p -> [Coord p]

euclidianDistance
    :: (Point p, Floating (Coord p))
    => p -> p -> Coord p
euclidianDistance a b
    -- square root of the summed squared coordinate differences
    = sqrt $ sum $ map (**2) $ zipWith (-) (coords a) (coords b)
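
As an illustration of how such a class would be used, the sketch below gives it two instances and one distance function that works for both. The P3 type and the name euclidean are hypothetical, and the distance here takes the square root of the summed squares, which is an assumption about what is intended.

```haskell
{-# LANGUAGE TypeFamilies, FlexibleContexts, FlexibleInstances #-}

-- The class from the text: a point knows its coordinate type
-- and can list its coordinates.
class Num (Coord p) => Point p where
    type Coord p
    coords :: p -> [Coord p]

-- 2D points as pairs of Double.
instance Point (Double, Double) where
    type Coord (Double, Double) = Double
    coords (x, y) = [x, y]

-- A hypothetical 3D point with Float coordinates.
data P3 = P3 Float Float Float

instance Point P3 where
    type Coord P3 = Float
    coords (P3 x y z) = [x, y, z]

-- Euclidean distance, written once for every Point instance
-- whose coordinates support Floating operations.
euclidean :: (Point p, Floating (Coord p)) => p -> p -> Coord p
euclidean a b =
    sqrt $ sum $ map (** 2) $ zipWith (-) (coords a) (coords b)
```

Literal pairs need a type annotation at the call site, for example euclidean ((0, 0) :: (Double, Double)) (3, 4), because the instance is chosen from the point type.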

Now you may want to make it a bit faster by replacing lists with vectors:

kMeans
    :: (Point p, DataSet vec p)
    => Int -> (p -> p -> Coord p) -> vec p
    -> [vec p]
kMeans k distance points = ...

class DataSet vec p where
  map :: ...
  foldl' :: ...

instance Unbox p => DataSet U.Vector p where
  map = U.map
  foldl' = U.foldl'
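
The method types are elided above. One way they could be filled in is sketched below, assuming the vector package is available. The method names mapPoints and foldPoints are hypothetical, chosen to avoid clashing with Prelude names, and the helper total exists only to show code written once against the class.

```haskell
{-# LANGUAGE MultiParamTypeClasses, FlexibleInstances #-}

import qualified Data.List as L
import qualified Data.Vector.Unboxed as U

-- A small container class: just the operations the algorithm needs.
-- Both parameters are required because U.Vector constrains its elements.
class DataSet vec p where
  mapPoints  :: (p -> p) -> vec p -> vec p
  foldPoints :: (b -> p -> b) -> b -> vec p -> b

-- Plain lists work for any element type.
instance DataSet [] p where
  mapPoints  = L.map
  foldPoints = L.foldl'

-- Unboxed vectors work only for Unbox element types.
instance U.Unbox p => DataSet U.Vector p where
  mapPoints  = U.map
  foldPoints = U.foldl'

-- Written once, usable with either container (hypothetical helper).
total :: (DataSet vec p, Num p) => vec p -> p
total = foldPoints (+) 0
```

With this, total [1, 2, 3 :: Double] and total (U.fromList [1, 2, 3 :: Double]) run the same code over a list and an unboxed vector.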

And so on.

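The suggested approach is to generalize the various parts of the algorithm and constrain those parts with small, loosely coupled type classes (where required).
Collecting everything in a single monolithic type class is bad style.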
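
If the distance strategy should still be selected at the type level, as the question asks with its Euclid and Kullback phantom types, a small dedicated class is one option. The following is only a sketch under assumptions: the class name Distance, the method distanceBy, and the use of Proxy to carry the tag are all hypothetical, and points are plain lists to keep it short.

```haskell
import Data.Proxy (Proxy (..))

-- Empty tag types; they exist only to pick an instance.
data Euclid
data Kullback

-- One small class for one concern: how to measure distance.
class Distance tag where
  distanceBy :: Floating a => Proxy tag -> [a] -> [a] -> a

instance Distance Euclid where
  distanceBy _ xs ys =
    sqrt (sum (zipWith (\x y -> (x - y) ** 2) xs ys))

instance Distance Kullback where
  -- Kullback-Leibler divergence: sum of p * log (p / q).
  -- Assumes both inputs are strictly positive.
  distanceBy _ ps qs = sum (zipWith (\p q -> p * log (p / q)) ps qs)
```

A caller picks the strategy with a proxy, for example distanceBy (Proxy :: Proxy Euclid) [0, 0] [3, 4 :: Double]. Both strategies can be used in the same module, which importing one of two instance modules does not allow.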

+3
