C++ field accessors - a cure for Public Field Phobia

C++ Standard

Recently I was looking for opinions on the lack of field accessors in the C++ standard. I came across a nice article, A Modest Proposal for Curing the Public Field Phobia, dated 2001. Since then C++ has matured a lot, gaining amazing new features with C++11 and the latest C++14 revision, but some old annoyances, such as the lack of field accessors, remain.

The argument most often used against providing field accessors in C++ is that the language was designed to be verbose and explicit, so it is better to write obj.setX(1) than obj.x = 1, which may let you believe you are accessing memory directly when you are not. This, however, leads us to very obscure code like this:

node.setVisitCount(node.getVisitCount() + 1);

Those who do not suffer from public field phobia may define visitCount as a public field and use node.visitCount++. But if it is a dynamic property, C++ leaves you no choice but to use the former obscure syntax.

Public Field Phobia

First, let's separate what is actually public field phobia from the real shortcomings of the C++ language.

Some C++ programmers tend to wrap every single class field with setters and getters and keep all class fields private at all costs:

class Point {
  float x_, y_;

 public:
  float getX() const  { return x_; }
  void setX(float x)  { x_ = x; }  // setters return nothing
  float getY() const  { return y_; }
  void setY(float y)  { y_ = y; }
};

Honestly, I think this is a horrible practice, and there is no excuse for doing that instead of:

struct Point {
  float x, y; // struct members are public by default
};

Read-only fields

Sometimes a field value needs to be read-only. Programmers often forget that a field can be declared const and remain public, which prevents its modification:

class Node {
 public:
  Node(int tag) : tag(tag) { /* ... */ }

  const int tag;
};

But there is a tradeoff here: such a field (here, tag) can only be set once, in the constructor initializer list. Taking a reference and casting away const to modify it later is not a real way out, because writing to an object that was declared const is undefined behavior.
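A further cost of a public const field is easy to overlook: the class can still be copy-constructed, but the compiler can no longer generate an assignment operator for it. The sketch below checks this at compile time. TaggedNode and tagOfCopy are hypothetical names used only for illustration; TaggedNode stands in for the Node class above.

```cpp
#include <type_traits>

// Hypothetical stand-in for the Node class above: one public const field.
class TaggedNode {
 public:
  explicit TaggedNode(int tag) : tag(tag) {}

  const int tag;  // readable by everyone, writable only in the initializer list
};

// Copy construction still works with a const member...
static_assert(std::is_copy_constructible<TaggedNode>::value,
              "a const member does not prevent copy construction");

// ...but the implicitly declared copy assignment operator is deleted,
// because assigning would have to overwrite the const field.
static_assert(!std::is_copy_assignable<TaggedNode>::value,
              "a const member disables the default copy assignment");

// Returns the tag read back from a copy of a freshly built node.
int tagOfCopy(int tag) {
  TaggedNode original(tag);
  TaggedNode copy(original);  // fine: copy construction
  // copy = original;         // would not compile: assignment is deleted
  return copy.tag;
}
```

Whether this matters depends on how the class is used; types that are stored by value and later reassigned are the ones most likely to be affected.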

Non-trivial setter

But when the setter is non-trivial and, rather than just writing a new value to memory, does some extra work, we need to fall back to the ugly getters and setters, and this time the phobia is not to blame:

class Node {
  /* ... */
  void setVisitCount(int count) {
    visitCount_ = count;
    updateNeighbors();
  }
  /* ... */
};

Dynamic properties

Finally, there are dynamic properties: stored nowhere and calculated on access. Let's take the following example:

#include <cassert>
#include <cmath>

class Vector {
 public:
  float x, y;

  Vector(float x, float y) : x(x), y(y) {}

  float getLength() const { return sqrtf(x*x + y*y); }

  // Rescales the vector to the requested length; nothing to return.
  void setLength(float length) {
    float lengthRatio = length / getLength();
    x *= lengthRatio;
    y *= lengthRatio;
  }
};

int main() {
  Vector v(1, 1);
  v.setLength(v.getLength() * 2);
  assert(v.x == 2);
}

Do we need such verbosity?

Okay, let's get back to the main question: why are there no accessors in the C++ standard? The maintainers of the language standard say we need such verbosity to let the programmer know whether an operation is a direct memory access or something more complicated. Moreover, the language should not offer means to modify its behavior, because point.x would have different meanings depending on whether an accessor has been defined or not.

Of course, you may wonder why operator[] is allowed then. It is because obj[1] on an instance of a class that does not define such an operator would have no meaning anyway, so in that case we do not override language behavior.

But honestly, this rule can be semi-broken using operator=, operator() or a type cast operator, any of which can mislead the programmer about what a piece of code does.

class Test {
 public:
  struct {
    void operator=(int /*value*/) {}  // accepts the value and discards it
  } field;
};

int main() {
  Test test;
  test.field = 1; // Is it really trivial direct field write?
                  // Nope, it does absolutely nothing!
}

Proposed field accessor operators

A Modest Proposal for Curing the Public Field Phobia proposes a new operator for defining field access. As the title states, it is really modest: it gently introduces a new operator.fieldname. The field we define an accessor for does not even need to exist in the class, so it works in all the cases where we want a dynamic or read-only property.

Rewriting our Vector example using the proposed language extension, we get:

class Vector {
 public:
  float x, y;

  Vector(float x, float y) : x(x), y(y) {}

  float operator.length() { return sqrtf(x*x + y*y); }

  void operator.length(float length) {
    float lengthRatio = length / sqrtf(x*x + y*y);  // getLength() no longer exists here
    x *= lengthRatio;
    y *= lengthRatio;
  }
};

int main() {
  Vector v(1, 1);
  v.length *= 2;
  assert(v.x == 2);
}

Isn’t it easier to read and consume? Now another example, a non-trivial setter:

class Node {
  int visitCount;
  /* ... */
  void operator.visitCount(int count) {
    visitCount = count;
    updateNeighbors();
  }
  /* ... */
};

Given that the field access operator takes priority over direct field access, we effectively disallow direct writes to the field from outside the class. Easy!

Now, can anyone tell me how to lobby the C++ committee to get this into the standard?

Property template accessors

Now some practical stuff: maybe we can leverage the existing template meta-programming powers of C++ to create such field accessors?

Simple accessors using nested struct

class Vector {
 public:
  float x, y;
  /* ... */
  struct {
    operator float() { return sqrtf(x*x + y*y); }

    void operator=(float length) {
      float lengthRatio = length / getLength();
      x *= lengthRatio; // <- Will fail!, x is not Vector::anonymous_struct
      y *= lengthRatio; //    field. Moreover we do not have access to x here.
    }
  } length;
};

Will that work? Nope. The operator= body executes in the context of Vector::anonymous_struct, not Vector, so we have no access to the x and y fields of the Vector instance, even though in this case Vector::anonymous_struct by definition cannot exist outside of a Vector.

We need to somehow get back to the Vector context.
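The most direct way back, sketched here on the assumption that spending one pointer per property is acceptable, is to let the nested struct remember its owner. LengthProxy and owner are hypothetical names, not part of the original example. This is a baseline for comparison and not the approach the rest of the article pursues.

```cpp
#include <cmath>

class Vector {
 public:
  float x, y;

  // Hypothetical proxy type: it stores a pointer to the Vector that owns it.
  struct LengthProxy {
    Vector* owner;

    // Reading the property computes the length on the fly.
    operator float() const {
      return std::sqrt(owner->x * owner->x + owner->y * owner->y);
    }

    // Writing the property rescales the owning vector.
    void operator=(float newLength) {
      float ratio = newLength / static_cast<float>(*this);
      owner->x *= ratio;
      owner->y *= ratio;
    }
  } length;

  Vector(float x, float y) : x(x), y(y), length{this} {}

  // A copy must point at itself, not at the object it was copied from.
  Vector(const Vector& other) : x(other.x), y(other.y), length{this} {}

  Vector& operator=(const Vector& other) {
    x = other.x;
    y = other.y;  // length.owner already points at *this and stays untouched
    return *this;
  }
};
```

With this in place, float len = v.length; and v.length = 10; both work, but every property adds a pointer to each instance and forces hand-written copy operations. Compound forms such as v.length *= 2 would need extra operators on the proxy.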

Property template proxy

Combining a nested struct with a type cast operator and operator= leads us to a more interesting solution:

// property proxy struct
template <typename T,
          class Parent,
          T (Parent::*Getter)(),
          void (Parent::*Setter)(T)>
struct property {
  operator T() { return (parent()->*Getter)(); }
  void operator=(T value) { (parent()->*Setter)(value); }

 private:
  // Ooops. How do we get pointer to the enclosing class?
  Parent* parent() { return reinterpret_cast<Parent*>(this); }
};

// our main class
class Node {
  int visitCount_;

 public:
  Node() : visitCount_(0) {}

  int getVisitCount() { return visitCount_; }
  void setVisitCount(int count) { visitCount_ = count; }
  property<int, Node, &Node::getVisitCount, &Node::setVisitCount> visitCount;
};

int main(int argc, char const* argv[]) {
  Node node;
  node.visitCount = 2;
  return 0;
}

Almost there, but the property<...>::parent() implementation is invalid. We need to somehow figure out the pointer to the enclosing Node object from the property’s this pointer.
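To see why the plain reinterpret_cast cannot be right, it helps to measure where a nested member actually lives. In the minimal sketch below, Outer and Inner are hypothetical stand-ins for Node and its property member, and innerOffset is a helper invented for illustration.

```cpp
#include <cstddef>

// Hypothetical stand-in for the property proxy: an empty nested object.
struct Inner {};

// Hypothetical stand-in for Node: a data field followed by the proxy.
struct Outer {
  int first;    // occupies the start of the object
  Inner inner;  // therefore has to live at a non-zero offset
};

// Returns the distance in bytes between an Outer object and its inner member.
std::ptrdiff_t innerOffset() {
  Outer outer{};
  const char* objectStart = reinterpret_cast<const char*>(&outer);
  const char* memberStart = reinterpret_cast<const char*>(&outer.inner);
  return memberStart - objectStart;
}
```

Because the member does not start where the enclosing object starts, casting the member's this pointer to the enclosing type yields a pointer into the middle of the object. The missing piece is exactly this offset.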

Okay, so maybe we can pass the offset of the field:

// property proxy struct
template <typename T,
          class Parent,
          T (Parent::*Getter)(),
          void (Parent::*Setter)(T),
          size_t Offset>
struct property {
  /* ... */
 private:
  Parent* parent() { return (Parent*)((char*)this - Offset); }
};

// our main class
class Node {
  /* ... */
  property<int, Node, &Node::getVisitCount, &Node::setVisitCount, offsetof(Node, visitCount)> visitCount;
};

Unfortunately, this will not compile! offsetof cannot be used there (yet), because Node is still an incomplete class at that point of its definition.

Final fugly preprocessor based solution

// a little trick to fool the compiler into thinking we are not accessing a NULL pointer here
#define property_offset(type, name) \
  (((char*)&((type*)(0xffff))->name) - (char*)(0xffff))

#define property_parent(type, name) \
  ((type*)((char*)(this) - property_offset(type, name)))

// macro defining property
#define property(type, name, parent)                                         \
  struct name##_property {                                                   \
    operator type() { return property_parent(parent, name)->get_##name(); }  \
    void operator=(type v) { property_parent(parent, name)->set_##name(v); } \
                                                                             \
   private:                                                                  \
    char zero[0];                                                            \
  } name

// our main class
class Node {
  /* ... */
  // the macro expects Node to define get_visitCount() and set_visitCount(int)
  property(int, visitCount, Node);
};

Caveats

  1. The C++ standard prohibits zero-sized classes and structs, forcing the compiler to give even an empty one a size of at least 1 byte. This keeps two fields or variables from sharing the same address.

This is clear enough, but in our case the property struct is just a proxy, and every property we define makes our class grow.

GCC and Clang can be fooled by declaring a private char zero[0] field. Zero-length arrays are a compiler extension rather than standard C++, and they make the struct occupy no space. Unfortunately this trick does not work for MSVC.

  2. We need to do fancy pointer arithmetic in property_offset. We cannot simply get the offset using offsetof, as it fails with an incomplete class error. We cannot calculate the offset from a nullptr to Parent either, as some compilers emit an error on NULL pointer access.

  3. Finally, because of all this, we cannot make the solution elegant and use only templates. We need some extra macros, which make it a bit unreadable.
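The size cost from the first caveat can be checked directly. The sketch below uses only standard C++ and deliberately leaves out the zero-length array trick. EmptyProxy, Plain, WithProxy and proxyOverhead are hypothetical names introduced for illustration.

```cpp
#include <cstddef>

// Hypothetical empty proxy, like a property struct with no data members.
struct EmptyProxy {};

// The same class without and with a proxy member.
struct Plain {
  int value;
};

struct WithProxy {
  int value;
  EmptyProxy proxy;
};

// Even an empty struct occupies at least one byte.
static_assert(sizeof(EmptyProxy) >= 1, "empty structs still have a size");

// So adding the proxy makes the enclosing class larger.
static_assert(sizeof(WithProxy) > sizeof(Plain),
              "every proxy member grows the class");

// Returns how many bytes one proxy member adds, padding included.
std::size_t proxyOverhead() {
  return sizeof(WithProxy) - sizeof(Plain);
}
```

On common platforms the overhead is larger than the single byte the proxy needs, because the compiler pads the class back up to the alignment of its other members.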

Comments appreciated!

Further reading