C ++ and its type system: how to work with data with several types?

"Introduction"

I am relatively new to C ++. I went through all the basic things and managed to build 2-3 simple translators for my programming languages.

The first thing that gave and still gives me a headache: Implementing the type system of my language in C ++

Think about it: Ruby, Python, PHP and Co. have many built-in types, which are obviously implemented in C. So I first tried to make it possible to give a value in my language of three possible types: Int, String and Nil.

I came up with this:

enum ValueType { Int, String, Nil }; class Value { public: ValueType type; int intVal; string stringVal; }; 

Yes, wow, I know. It was very slow to go through this class, because you had to call the row allocator all the time.

The next time I tried something like this:

 enum ValueType { Int, String, Nil }; extern string stringTable[255]; class Value { public: ValueType type; int index; }; 

I would save all the lines in stringTable and write their position in index . If the Value type was Int , I just stored the integer in index , it would not make sense at all using the int index to access another int or?

In any case, the above said gave me a headache too. After some time, access to the row from the table here, a link to it there and its copying there turned over over my head - I lost control. I had to impose a translator.

Now: Good, so C and C ++ are statically typed.

  • How do the main implementations of the above languages ​​process different types in their programs (fixnums, bignums, nums, string, array, resources, ...)?

  • What to do to get maximum speed with many different types available?

  • How do solutions compare to my simplified versions above?

+7
c ++ interpreter typing
source share
5 answers

There are several different things you can do here. Various solutions arose on time, and most of them require dynamic allocation of the actual database (boost :: variant can avoid the use of dynamically allocated memory for small objects --thanks @MSalters).

Pure C:

Store type information and a void pointer in memory, which should be interpreted according to type information (usually this is an enumeration):

 enum type_t { integer, string, null }; typedef struct variable { type_t type; void * datum; } variable_t; void init_int_variable( variable_t * var, int value ) { var->type = integer;   var->datum = malloc( sizeof(int) ); *((int)var->datum) = value; } void fini_variable( variable_t var ) // optionally by pointer { free( var.datum ); } 

In C ++, you can improve this approach by using classes to simplify use, but more importantly, you can use more complex solutions and use existing libraries like boost :: any or boost :: variant, which offer different solutions for one and the same problem.

Both boost :: any and boost :: variant store values ​​in dynamically allocated memory, usually with a pointer to a virtual class in the hierarchy, and with operators that reinterpret (down casts) to specific types.

+4
source share

One obvious solution is to define a type hierarchy:

 class Type { }; class Int : public Type { }; class String : public Type { }; 

etc. As a complete example, we will write an interpreter for a tiny language. The language allows you to declare the following variables:

 var a 10 

This will create an Int object, set it to 10 and save it in the variable table under the name a . Operations can be called on variables. For example, the operation of adding two Int values ​​looks like this:

 + ab 

Here is the complete interpreter code:

 #include <iostream> #include <string> #include <vector> #include <sstream> #include <cstdlib> #include <map> // The base Type object from which all data types are derived. class Type { public: typedef std::vector<Type*> TypeVector; virtual ~Type () { } // Some functions that you may want all types of objects to support: // Returns the string representation of the object. virtual const std::string toString () const = 0; // Returns true if other_obj is the same as this. virtual bool equals (const Type &other_obj) = 0; // Invokes an operation on this object with the objects in args // as arguments. virtual Type* invoke (const std::string &opr, const TypeVector &args) = 0; }; // An implementation of Type to represent an integer. The C++ int is // used to actually store the value. As a consequence this type is // machine dependent, which might not be what you want for a real // high-level language. class Int : public Type { public: Int () : value_ (0), ret_ (NULL) { } Int (int v) : value_ (v), ret_ (NULL) { } Int (const std::string &v) : value_ (atoi (v.c_str ())), ret_ (NULL) { } virtual ~Int () { delete ret_; } virtual const std::string toString () const { std::ostringstream out; out << value_; return out.str (); } virtual bool equals (const Type &other_obj) { if (&other_obj == this) return true; try { const Int &i = dynamic_cast<const Int&> (other_obj); return value_ == i.value_; } catch (std::bad_cast ex) { return false; } } // As of now, Int supports only addition, represented by '+'. virtual Type* invoke (const std::string &opr, const TypeVector &args) { if (opr == "+") { return add (args); } return NULL; } private: Type* add (const TypeVector &args) { if (ret_ == NULL) ret_ = new Int; Int *i = dynamic_cast<Int*> (ret_); Int *arg = dynamic_cast<Int*> (args[0]); i->value_ = value_ + arg->value_; return ret_; } int value_; Type *ret_; }; // We use std::map as a symbol (or variable) table. typedef std::map<std::string, Type*> VarsTable; typedef std::vector<std::string> Tokens; // A simple tokenizer for our language. Takes a line and // tokenizes it based on whitespaces. static void tokenize (const std::string &line, Tokens &tokens) { std::istringstream in (line, std::istringstream::in); while (!in.eof ()) { std::string token; in >> token; tokens.push_back (token); } } // Maps varName to an Int object in the symbol table. To support // other Types, we need a more complex interpreter that actually infers // the type of object by looking at the format of value. static void setVar (const std::string &varName, const std::string &value, VarsTable &vars) { Type *t = new Int (value); vars[varName] = t; } // Returns a previously mapped value from the symbol table. static Type * getVar (const std::string &varName, const VarsTable &vars) { VarsTable::const_iterator iter = vars.find (varName); if (iter == vars.end ()) { std::cout << "Variable " << varName << " not found." << std::endl; return NULL; } return const_cast<Type*> (iter->second); } // Invokes opr on the object mapped to the name var01. // opr should represent a binary operation. var02 will // be pushed to the args vector. The string represenation of // the result is printed to the console. static void invoke (const std::string &opr, const std::string &var01, const std::string &var02, const VarsTable &vars) { Type::TypeVector args; Type *arg01 = getVar (var01, vars); if (arg01 == NULL) return; Type *arg02 = getVar (var02, vars); if (arg02 == NULL) return; args.push_back (arg02); Type *ret = NULL; if ((ret = arg01->invoke (opr, args)) != NULL) std::cout << "=> " << ret->toString () << std::endl; else std::cout << "Failed to invoke " << opr << " on " << var01 << std::endl; } // A simple REPL for our language. Type 'quit' to exit // the loop. int main (int argc, char **argv) { VarsTable vars; std::string line; while (std::getline (std::cin, line)) { if (line == "quit") break; else { Tokens tokens; tokenize (line, tokens); if (tokens.size () != 3) { std::cout << "Invalid expression." << std::endl; continue; } if (tokens[0] == "var") setVar (tokens[1], tokens[2], vars); else invoke (tokens[0], tokens[1], tokens[2], vars); } } return 0; } 

Exemplary interaction with the interpreter:

 /home/me $ ./mylang var a 10 var b 20 + ab 30 + ac Variable c not found. quit 
+4
source share

Regarding speed, you say:

It was very slow to go around this class as a line dispenser should have been called all the time.

Do you know that you must pass objects by reference the vast majority of the time? Your solution looks workable for a simple interpreter.

+1
source share

C ++ is a strongly typed language. I see that you come from an unprintable language and still think in these terms.

If you really need to save multiple types in a variable, look boost :: any .

However, if you use an interpreter, you must use inheritance and classes that represent a particular type.

+1
source share

In accordance with the Vijay decision, the implementation will be:

 Type* array; // to initialize the array array = new Type(size_of_array); // when you want to add values array[0] = new Int(42); // to add another string value array[1] = new String("fourty two"); 

A bit missing in his code is HOW to extract these values ​​... Here is my version (in fact, I found out from Ogre and changed it to my liking).

Usage is something like:

 Any array[4]; // Automatically understands it an integer array[0] = Any(1); // But let say you want the number to be thought of as float array[1] = Any<float>(2); // What about string? array[2] = Any<std::string>("fourty two"); // Note that this gets the compiler thinking it a char* // instead of std::string array[3] = Any("Sometimes it just turns out to be what you don't want!"); 

Ok now to determine if a particular element is a string:

 if(array[2].isType<std::string>() { // Extract the string value. std::string val = array[2].cast<std::string>(); // Make the string do your bidding!!!... /evilgrin // WAIT! But what if you want to directly manipulate // the value in the array? std::string& val1 = array[2].cast<std::string>(); // HOHOHO... now any changes to val1 affects the value // in the array ;) } 

The code for the Any class is given below. Feel free to use it as you like :). Hope this helps!

In the header file ... say Any.h

  #include <typeinfo> #include <exception> /* * \class Any * \brief A variant type to hold any type of value. * \detail This class can be used to store values whose types are not * known before hand, like to store user-data. */ class Any { public: /*! * \brief Default constructor. */ Any(void); /*! * \brief Constructor that accepts a default user-defined value. * \detail This constructor copies that user-defined value into a * place holder. This constructor is explicit to avoid the compiler * to call this constructor implicitly when the user didn't want * the conversion to happen. * \param val const reference to the value to be stored. */ template <typename ValueType> explicit Any(const ValueType& val); /*! * \brief Copy constructor. * \param other The \c Any variable to be copied into this. */ Any(const Any& other); /*! * \brief Destructor, does nothing other than destroying the place holder. */ ~Any(void); /*! * \brief Gets the type of the value stored by this class. * \detail This function uses typeid operator to determine the type * of the value it stores. * \remarks If the place holder is empty it will return Touchscape::VOID_TYPE. * It is wise to check if this is empty by using the function Any::isEmpty(). */ const std::type_info& getType() const; /*! * \brief Function to verify type of the stored value. * \detail This function can be used to verify the type of the stored value. * Usage: * \code * int i; * Touchscape::Any int_any(i); * // Later in your code... * if (int_any.isType<int>()) * { * // Do something with int_any. * } * \endcode * \return \c true if the type matches, false otherwise. */ template <typename T> bool isType() const; /*! * \brief Checks if the type stored can be converted 'dynamically' * to the requested type. * \detail This would useful when the type stored is a base class * and you would like to verify if it can be converted to type * the user wants. * Example: * \code * class Base * { * // class implementation. * }; * class Derived : public Base * { * // class implementation. * }; * * // In your implementation function. * { * //... * // Somewhere in your code. * Base* a = new Derived(); * Touchscape::Any user_data(a); * my_object.setUserData(user_data); * // Then when you need to know the user-data type * if(my_object.getUserData().isDynamicType<Derived>()) * { * // Do something with the user data * } * } * \endcode * \return \c true if the value stored can be dynamically casted to the target type. * \deprecated This function will be removed and/or changed in the future. */ template <typename T> bool isDynamicType() const; /*! * \brief Convert the value stored to the required type. * \detail This function is used just like a static-cast to retrieve * the stored value. * \return A reference to the stored value. * \warning This function will throw std::bad_cast exception if it * finds the target type to be incorrect. */ template <typename T> T& cast(); /*! * \brief Convert the value stored to the required type (const version). * \detail This function is used just like static_cast to retrieve * the stored value. * \return A \c const reference to the stored value. * \warning This function will throw std::bad_cast exception if it * finds the target type to be incorrect. */ template <typename T> const T& cast() const; /*! * \brief Dynamically converts the stored value to the target type * \detail This function is just like dynamic_cast to retrieve * the stored value to the target type. * \return A reference to the stored value. * \warning This function will throw std::bad_cast exception if it * finds that the value cannot be dynamically converted to the target type. * \deprecated This function will be removed and/or changed in the future. */ template <typename T> T& dynamicCast(); /*! * \brief Dynamically converts the stored value to the target type (const version) * \detail This function is just like dynamic_cast to retrieve * the stored value to the target type. * \return A const reference to the stored value. * \warning This function will throw std::bad_cast exception if it * finds that the value cannot be dynamically converted to the target type. * \deprecated This function will be removed and/or changed in the future. */ template <typename T> const T& dynamicCast() const; /*! * \brief Swaps the contents with another \c Any variable. * \return reference to this instance. */ Any& swap(Any& other); /*! * \brief Checks if the place holder is empty. * \return \c true if the the place holder is empty, \c false otherwise. */ bool isEmpty() const; /*! * \brief Checks if the place holder is \b not empty. * \return \c true if the the place holder is not empty, \c false otherwise. * \remarks This is just a lazy programmer attempt to make the code look elegant. */ bool isNotEmpty() const; /*! * \brief Assignment operator * \detail Assigns a 'raw' value to this instance. * \return Reference to this instance after assignment. */ template <typename ValueType> Any& operator = (const ValueType& rhs); /*! * \brief Default assignment operator * \detail Assigns another \c Any type to this one. * \return Reference to this instance after assignment. */ Any& operator = (const Any& rhs); /*! * \brief Boolean equality operator */ bool operator == (const Any& other) const; /*! * \brief Boolean equality operator that accepts a 'raw' type. */ template<typename ValueType> bool operator == (const ValueType& other) const; /*! * \brief Boolean inequality operator */ bool operator != (const Any& other) const; /*! * \brief Boolean inequality operator that accepts a 'raw' type. */ template<typename ValueType> bool operator != (const ValueType& other) const; protected: /*! * \class PlaceHolder * \brief The place holder base class * \detail The base class for the actual 'type'd class that stores * the value for T ouchscape::Any. */ class PlaceHolder { public: /*! * \brief Virtual destructor. */ virtual ~PlaceHolder(){} /*! * \brief Gets the \c type_info of the value stored. * \return (const std::type_info&) The typeid of the value stored. */ virtual const std::type_info& getType() const = 0; /*! * \brief Clones this instance. * \return (PlaceHolder*) Cloned instance. */ virtual PlaceHolder* clone() const = 0; }; /*! * \class PlaceHolderImpl * \brief The class that ultimately keeps hold of the value stored * in Touchscape::Any. */ template <typename ValueType> class PlaceHolderImpl : public PlaceHolder { public: /*! * \brief The only constructor allowed. * \param val The value to store. */ PlaceHolderImpl(const ValueType& val) :m_value(val){} /*! * \brief The destructor. * \detail Does nothing */ ~PlaceHolderImpl(){} /*! * \copydoc Touchscape::PlaceHolder::getType() */ const std::type_info& getType() const { return typeid(ValueType); } /*! * \copydoc Touchscape::PlaceHolder::clone() */ PlaceHolder* clone() const { return new PlaceHolderImpl<ValueType>(m_value); } ValueType m_value; }; PlaceHolder* m_content; }; /************************************************************************/ /* Template code implementation section */ /************************************************************************/ template <typename ValueType> Any::Any(const ValueType& val) :m_content(new PlaceHolderImpl<ValueType>(val)) { } //--------------------------------------------------------------------- template <typename T> bool Any::isType() const { bool result = m_content?m_content->getType() == typeid(T):false; return result; } //--------------------------------------------------------------------- template <typename T> bool Any::isDynamicType() const { bool result = m_content ?dynamic_cast<T>(static_cast<PlaceHolderImpl<T>*>(m_content)->m_value)!=NULL :false; return result; } //--------------------------------------------------------------------- template <typename T> T& Any::cast() { if (getType() != VOID_TYPE && isType<T>()) { T& result = static_cast<PlaceHolderImpl<T>*>(m_content)->m_value; return result; } StringStream ss; ss<<"Cannot convert '"<<getType().name()<<"' to '"<<typeid(T).name()<<"'. Did you mean to use dynamicCast() to cast to a different type?"; throw std::bad_cast(ss.str().c_str()); } //--------------------------------------------------------------------- template <typename T> const T& Any::cast() const { Any& _this = const_cast<Any&>(*this); return _this.cast<T>(); } //--------------------------------------------------------------------- template <typename T> T& Any::dynamicCast() { T* result = dynamic_cast<T>(static_cast<PlaceHolderImpl<T>*>(m_content)->m_value); if (result == NULL) { StringStream ss; ss<<"Cannot convert '"<<getType().name()<<"' to '"<<typeid(T)<<"'."; throw std::bad_cast(ss.str().c_str()); } return *result; } //--------------------------------------------------------------------- template <typename T> const T& Any::dynamicCast() const { Any& _this = const_cast<Any&>(*this); return _this.dynamicCast<T>(); } //--------------------------------------------------------------------- template <typename ValueType> Any& Any::operator = (const ValueType& rhs) { Any(rhs).swap(*this); return *this; } //--------------------------------------------------------------------- template <typename ValueType> bool Any::operator == (const ValueType& rhs) const { bool result = m_content == rhs; return result; } //--------------------------------------------------------------------- template <typename ValueType> bool Any::operator != (const ValueType& rhs) const { bool result = m_content != rhs; return result; } 

Now in the CPP file ... Any.cpp

 #include "Any.h" static const std::type_info& VOID_TYPE(typeid(void)); Any::Any( void ) :m_content(NULL) { } //--------------------------------------------------------------------- Any::Any( const Any& other ) :m_content(other.m_content?other.m_content->clone():NULL) { } //--------------------------------------------------------------------- Any::~Any( void ) { SafeDelete(m_content); } //--------------------------------------------------------------------- const std::type_info& Any::getType() const { return m_content?m_content->getType():VOID_TYPE; } //--------------------------------------------------------------------- Any& Any::swap( Any& other ) { std::swap(m_content, other.m_content); return *this; } //--------------------------------------------------------------------- Any& Any::operator=( const Any& rhs ) { Any(rhs).swap(*this); return *this; } //--------------------------------------------------------------------- bool Any::isEmpty() const { bool is_empty = m_content == NULL; return is_empty; } //--------------------------------------------------------------------- bool Any::isNotEmpty() const { bool is_not_empty = m_content != NULL; return is_not_empty; } //--------------------------------------------------------------------- bool Any::operator==( const Any& other ) const { bool result = m_content == other.m_content; return result; } //--------------------------------------------------------------------- bool Any::operator!=( const Any& other ) const { bool result = m_content != other.m_content; return result; } 
0
source share