What is () in Haskell, exactly?

Question


I have been reading "Learn You a Haskell", and in the chapters on monads it seems to me that () is treated as a kind of "null" for every type. When I check the type of () in GHCi, I get

 >> :t ()
 () :: ()

which is an extremely confusing statement. It seems that () is a type all to itself. I am confused about how it fits into the language, and how it seems to be able to stand in for any type.

+53

types haskell unit-type

MYV Jun 03 '13 at 8:20


6 answers

The type () can be thought of as a zero-element tuple. It is a type that has only one value, so it is used where you need a type but do not actually need to convey any information. Here are a couple of uses for it.

Monadic things like IO and State have a return value as well as performing side effects. Sometimes the only point of the operation is to perform a side effect, such as writing to the screen or storing some state. For writing to the screen, putStrLn must have type String -> IO ? because IO always has to have some return type, but here there is nothing useful to return. So what type should we return? We could say Int and always return 0, but that is misleading. So we return (), the type that has only one value (and therefore no useful information), to indicate that nothing useful is returned.

It is sometimes useful to have a type that can have no useful values. Consider a type Map k v that maps keys of type k to values of type v. Then you want to implement a Set, which is really similar to a map except that you do not need the value part, just the keys. In a language like Java you might use boolean as the dummy value type, but really you just want a type that has no useful values. So you could say type Set k = Map k ()
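To make the Set idea concrete, here is a minimal sketch. It assumes Data.Map.Strict from the containers package (which ships with GHC); the helper names emptySet, insertKey, memberKey and setToList are made up for illustration and are not a real library API.

```haskell
import qualified Data.Map.Strict as Map
import Data.Map.Strict (Map)

-- A set of keys is a map whose values carry no information.
type Set k = Map k ()

-- Hypothetical helpers, named only for this example.
emptySet :: Set k
emptySet = Map.empty

insertKey :: Ord k => k -> Set k -> Set k
insertKey k = Map.insert k ()   -- the stored value is always ()

memberKey :: Ord k => k -> Set k -> Bool
memberKey = Map.member

setToList :: Set k -> [k]
setToList = Map.keys

main :: IO ()
main = do
  let s = insertKey 3 (insertKey 1 (insertKey 3 emptySet)) :: Set Int
  print (setToList s)     -- [1,3]
  print (memberKey 2 s)   -- False
```

Inserting 3 twice leaves a single entry, because the only thing a key can ever map to is ().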

It should be noted that () is not particularly magical. If you want, you can store it in a variable and pattern match on it (although there is not much point):

 main = do
   x <- putStrLn "Hello"
   case x of
     () -> putStrLn "The only value..."

+28

Neil Brown Jun 03 '13 at 8:47


It is called the Unit type, commonly used to represent side effects. You can think of it loosely as Void in Java. More here and here, etc. What can be confusing is that () syntactically represents both the type and its only value literal. Also note that it is not like null in Java, which means an undefined reference; () is effectively just a tuple of size 0.

+11

thSoft Jun 03 '13 at 8:24


I really like to think about () by analogy with tuples.

(Int, Char) is the type of all pairs of an Int and a Char, so its values are all possible values of Int crossed with all possible values of Char. (Int, Char, String) is similarly the type of all triples of an Int, a Char and a String.

It is easy to see how to keep extending this pattern upward, but what about downward?

(Int) would be the 1-tuple type, consisting of all possible values of Int. But Haskell parses that as just putting parentheses around Int, so it is simply the type Int. And the values in this type would be (1), (2), (3), etc., which are also just parsed as ordinary Int values in parentheses. But if you think about it, a "1-tuple" is exactly the same as a single value, so there is no need for 1-tuples to actually exist.

Going down one step further to zero-tuples gives us (), which should be all possible combinations of values in an empty list of types. Well, there is exactly one way to do that, which is to contain no other values, so there should be only one value in the type (). And by analogy with tuple value syntax, we can write that value as (), which certainly looks like a tuple containing no values.

That is exactly how it works. There is no magic, and this type () and its value () are in no way treated specially by the language.

() is not in fact being treated as "a null value for any type" in the monad examples in the LYAH book. Whenever the type () is used, the only value that can be returned is (). So it is used as a type to say explicitly that there cannot be any other return value. Likewise, where another type is supposed to be returned, you cannot return ().

The thing to keep in mind is that when a bunch of monadic computations are composed together with do blocks or operators like >>=, >>, etc., they build a value of type m a for some monad m. That choice of m has to stay the same throughout the component parts (there is no way to compose a Maybe Int with an IO Int in that way), but the a can and very often does differ at each stage.

So when someone sticks an IO () in the middle of an IO String computation, that is not using () as a null in the String type; it is simply using an IO () on the way to building an IO String, the same way you could use an Int on the way to building a String.
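As a small illustration of that last point (the function name describe is invented for this sketch), an IO () step and an Int can both appear on the way to an IO String without either of them pretending to be a String:

```haskell
-- Hypothetical example: the monad stays IO throughout,
-- while the result type differs from step to step.
describe :: String -> IO String
describe name = do
  putStrLn "computing..."      -- an IO () step, run only for its effect
  let n = length name          -- an Int used along the way
  pure (name ++ " has " ++ show n ++ " characters")

main :: IO ()
main = do
  s <- describe "unit"
  putStrLn s
```

Trying to end describe with pure () instead would be rejected by the type checker, since () is not a String.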

+7

Ben Jun 04 '13 at 3:24


The confusion comes from other programming languages: in most imperative languages, "void" means that there is no structure in memory storing a value. This seems inconsistent, because "boolean" has 2 values rather than 2 bits while "void" has no bits rather than no values, but there it is about what a function returns in a practical sense. To be exact: its single value consumes no bits of memory.

Let's ignore the value bottom (written _|_ ) for a moment...

() is called Unit and is written like a null-tuple. It has only one value. And it is not called Void, because Void does not even have any value, and so could not be returned by any function.

Observe this: Bool has 2 values (True and False), () has one value ( () ), and Void has no value (it does not exist). They are like sets with two / one / no elements. The least memory needed to store their value is 1 bit / no bit / impossible, respectively. This means that a function returning () may return with a result value (the obvious one) that may be useless to you. Void, on the other hand, would imply that the function never returns and never gives you any result, because no result would exist.

If you want to give a name to "the value" returned by a function that never returns (yes, this sounds like crazytalk), then call it bottom (" _|_ ", written like an upside-down T). It could represent an exception, an infinite loop, a deadlock, or "just wait longer". (Some functions return bottom only if one of their parameters is bottom.)

When you create the Cartesian product / a tuple of these types, you will observe the same behaviour: (Bool,Bool,Bool,(),()) has 2 · 2 · 2 · 1 · 1 = 8 different values. (Bool,Bool,Bool,(),Void) is like the set {t, f} × {t, f} × {t, f} × {u} × {}, which has 2 · 2 · 2 · 1 · 0 = 0 elements, unless you count _|_ as a value.
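The counting argument can be checked by enumerating the values. This is only a sketch: it uses Void from Data.Void in base, and since Void has no constructors its (non-bottom) values are listed by hand as the empty list.

```haskell
import Data.Void (Void)

bools :: [Bool]
bools = [minBound .. maxBound]   -- [False,True]

units :: [()]
units = [minBound .. maxBound]   -- [()]

voids :: [Void]
voids = []                       -- no constructors, so nothing to list (ignoring bottom)

withUnits :: [(Bool, Bool, Bool, (), ())]
withUnits = [ (a, b, c, u, v) | a <- bools, b <- bools, c <- bools, u <- units, v <- units ]

withVoid :: [(Bool, Bool, Bool, (), Void)]
withVoid = [ (a, b, c, u, v) | a <- bools, b <- bools, c <- bools, u <- units, v <- voids ]

main :: IO ()
main = do
  print (length withUnits)   -- 8
  print (length withVoid)    -- 0
```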

+6

comonad Jun 18 '13 at 16:25


Another angle:

() is the name of a set that contains one element, which is also called () .

What is indeed a bit confusing is that, in this case, the name of the set and the name of the element in it happen to be the same.

Remember: in Haskell, a type is a set that has its possible values as its elements.

+4

jhegedus Apr 18 '14 at 14:50


pigworker · Accepted Answer · 2013-06-03 09:44

tl;dr () does not add a "null" value to every type, hell no; () is a "dull" value in a type of its own: () .

Let me step back from the question a moment and address a common source of confusion. A key thing to absorb when learning Haskell is the distinction between its expression language and its type language. You are probably aware that the two are kept separate. But that allows the same symbol to be used in both, and that is what is going on here. There are simple textual cues to tell you which language you are looking at. You do not need to parse the whole language to detect these cues.

The top level of a Haskell module lives, by default, in the expression language. You define functions by writing equations between expressions. But when you see foo :: bar in the expression language, it means that foo is an expression and bar is its type. So when you read () :: () , you are seeing a statement which relates the () in the expression language with the () in the type language. The two () symbols mean different things, because they are not in the same language. This repetition often causes confusion for beginners, until the expression/type language separation installs itself in their subconscious, at which point it becomes helpfully mnemonic.

The keyword data introduces a new datatype declaration, involving a careful mixture of the expression and type languages, as it says first what the new type is, and secondly what its values are.

 data TyCon tyvar ... tyvar = ValCon1 type ... type
                            |  ...
                            |  ValConn type ... type

In such a declaration, the type constructor TyCon is added to the type language, and the ValCon value constructors are added to the expression language (and its pattern sublanguage). In a data declaration, the things which stand in argument places for the ValCons tell you the types given to the arguments when that ValCon is used in expressions. For example,

 data Tree a = Leaf | Node (Tree a) a (Tree a)

declares a type constructor Tree for binary tree types storing elements at nodes, whose values are given by the value constructors Leaf and Node. I like to colour type constructors (Tree) blue and value constructors (Leaf, Node) red. There should be no blue in expressions and (unless you are using advanced features) no red in types. The built-in type Bool could be declared,

 data Bool = True | False

adding blue Bool to the type language, and red True and False to the expression language. Sadly, my markdown-fu is inadequate to the task of adding the colours to this post, so you will just have to learn to add the colours in your head.

The "unit" type uses () as a special symbol, but it works as if declared

 data () = () -- the left () is blue; the right () is red

meaning that a notionally blue () is a type constructor in the type language, but that a notionally red () is a value constructor in the expression language, and indeed () :: () . [It is not the only example of such a pun. The types of larger tuples follow the same pattern: pair syntax is as if given by

 data (a, b) = (a, b)

adding (,) to both the type and expression languages. But I digress.]
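Since data () = () is not something you can actually type into a module, here is a sketch of the same pun using ordinary names. Unit and Pair are hypothetical stand-ins for the built-in () and (,); each name is declared once in the type language and once in the expression language.

```haskell
-- Hypothetical stand-ins for the built-in unit and pair types.
data Unit = Unit              -- left Unit: type language; right Unit: expression language
  deriving (Eq, Show)

data Pair a b = Pair a b      -- same pun, with two type arguments
  deriving (Eq, Show)

-- Unit and () carry the same (zero) amount of information,
-- so converting between them loses nothing.
toUnit :: () -> Unit
toUnit () = Unit

fromUnit :: Unit -> ()
fromUnit Unit = ()

main :: IO ()
main = do
  print (Unit :: Unit)        -- first Unit is an expression, second is a type
  print (fromUnit (toUnit ()))
  print (Pair True 'x' :: Pair Bool Char)
```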

So the type () , often pronounced "Unit", is a type containing one value worth speaking of: that value is written () but in the expression language, and is sometimes pronounced "void". A type with only one value is not very interesting. A value of type () contributes zero bits of information: you already know what it must be. So, while there is nothing special about type () to indicate side effects, it often shows up as the value component in a monadic type. Monadic operations tend to have types which look like

  val-in-type-1 -> ... -> val-in-type-n -> effect-monad val-out-type

where the return type is a type application: the function tells you which effects are possible, and the argument tells you what sort of value is produced by the operation. For example

 put :: s -> State s ()

which is read (because application associates to the left ["as we all did in the sixties", Roger Hindley]) as

 put :: s -> (State s) ()

has one value input type s , the effect-monad State s , and the value output type () . When you see () as a value output type, that just means "this operation is used only for its effect; the value delivered is uninteresting". Similarly

 putStr :: String -> IO ()

delivers a string to stdout but does not return anything exciting.
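To see why put has nothing interesting to deliver, here is a hand-rolled sketch of a state type. It is a simplified stand-in written for this example (no Monad instance, and not the definition used by any real library): a stateful operation is modelled as a function from the old state to a result paired with the new state.

```haskell
-- Simplified stand-in for a state monad (assumption: illustration only).
newtype State s a = State { runState :: s -> (a, s) }

-- Replace the state; the delivered value is the uninformative ().
put :: s -> State s ()
put new = State (\_old -> ((), new))

-- Read the state; here the delivered value does carry information.
get :: State s s
get = State (\s -> (s, s))

main :: IO ()
main = do
  print (runState (put "new") "old")   -- ((),"new")
  print (runState get "old")           -- ("old","old")
```

All the work of put shows up in the second component (the new state); the first component is () because there is nothing else to report.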

The () type is also useful as an element type for container-like structures, where it indicates that the data consists just of a shape, with no interesting payload. For instance, if Tree is declared as above, then Tree () is the type of binary tree shapes, storing nothing of interest at nodes. Similarly, [()] is the type of lists of dull elements, and if there is nothing of interest in a list's elements, then the only information it contributes is its length.
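A short sketch of the shape idea, reusing the Tree declaration from above. The Functor instance and the function name shape are additions made for this example.

```haskell
data Tree a = Leaf | Node (Tree a) a (Tree a)
  deriving (Eq, Show)

instance Functor Tree where
  fmap _ Leaf         = Leaf
  fmap f (Node l x r) = Node (fmap f l) (f x) (fmap f r)

-- Forget the payload at every node, keeping only the structure.
shape :: Tree a -> Tree ()
shape = fmap (const ())

main :: IO ()
main = do
  let t1 = Node Leaf 'a' (Node Leaf 'b' Leaf)
      t2 = Node Leaf 'x' (Node Leaf 'y' Leaf)
  print (shape t1 == shape t2)                     -- True: same shape, different payloads
  print (length (map (const ()) "hello"))          -- 5: a [()] is just a length
  print (replicate 3 () == map (const ()) "abc")   -- True
```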

To sum up, () is a type. Its one value, () , happens to have the same name, but that is fine because the type and expression languages are separate. It is useful to have a type representing "no information" because, in context (e.g., of a monad or a container), it tells you that only the context is interesting.
