Ruby Enums - Approaches by eladmeidar
Where did my Enum go?
#Enum is probably one of the things I miss most in #Ruby. I'm not sure why it is absent: as a fairly common design pattern, I expected to see it in Ruby's stdlib (or maybe even core) and in the Ruby / Rails codebases around.
An Enum (or Factor, if you will) is basically a set of named values called members (or elements) that act as constants and represent some sort of numeric (mostly integer) value. It is mainly used to represent a string constant numerically, which makes data easier to serialize and saves you from retyping long, annoying strings. Java, C#, and even our beloved neighbor Python all provide some kind of Enum implementation as part of their toolbelt. Some databases, like [MySQL](http://dev.mysql.com/doc/refman/5.0/en/enum.html) and Postgres, provide a "by the book" implementation of enum types, while others, like SQL Server (yes, Microsoft again), need some black magic in the form of a CHECK constraint to achieve an Enum representation:
mycol VARCHAR(10) NOT NULL CHECK (mycol IN('Oranges', 'Apples', 'Pineapples'))
But, a decent Enum is still missing in Ruby.
Quick draw Enum in Ruby
If you ignore all the fancy words around the Enum definition, you are probably saying to yourself - "Eh, it's a hash".
Right.
The simplest way is to create a frozen hash constant that will hold all your constants:
class Canvas
  COLORS = {
    0 => :black,
    1 => :white,
    2 => :green,
    ...
  }.freeze
end
Testing it out:
$> Canvas::COLORS[0]
:black
This #implementation is rather simple and convenient - but not too easy to maintain and carries absolutely no syntactic sugar.
Note that if you want to access the numeric value via a given constant (ex: Canvas::COLORS[:black]) you'll need to add extra access methods on top of the frozen hash. Another problem I have with this implementation is that it is too specific: if you want another class to use the COLORS "enum", you'll need to define it all over again.
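As a rough illustration of what such access could look like, Ruby's Hash#key and Hash#invert can both give you the reverse lookup. This is only a sketch: it assumes the three colors listed above, and COLOR_IDS is a name made up for this example.

```ruby
class Canvas
  COLORS = {
    0 => :black,
    1 => :white,
    2 => :green
  }.freeze

  # Hypothetical companion constant: the same pairs, flipped once up front.
  COLOR_IDS = COLORS.invert.freeze
end

Canvas::COLORS[0]           # => :black
Canvas::COLORS.key(:black)  # => 0 (scans the hash for the value)
Canvas::COLOR_IDS[:green]   # => 2 (direct lookup in the inverted hash)
```

Hash#key is fine for a handful of members; the inverted hash trades a little memory for a direct lookup.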
Reusable Module Enum
The Reusable module implementation is very similar to the implementation that you'll find in C#, Python and Java - a free module that allows the enum to be re-used in multiple classes and is basically all about defining constants:
module ColorsEnum
  BLACK = 0
  WHITE = 1
  GREEN = 2
end

class Canvas
  include ColorsEnum
end

class Wall
  include ColorsEnum
end
This form allows code reuse, of course, but it still lacks the ability to access the enum members by both numeric value and constant name.
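To make the limitation concrete: the closest you get with plain constants is reflection through Module#const_get and Module#constants. The sketch below works, but it shows how clumsy the value-to-name direction is.

```ruby
module ColorsEnum
  BLACK = 0
  WHITE = 1
  GREEN = 2
end

# Name -> value: const_get accepts a symbol or a string.
ColorsEnum.const_get(:WHITE) # => 1

# Value -> name: scan the module's constants for a matching value.
ColorsEnum.constants.find { |name| ColorsEnum.const_get(name) == 2 } # => :GREEN
```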
Enum by method
Before reaching out to solve the dual access problem (isn't it annoying to declare constants manually?) let's add some syntactic sugar to the constant definition process.
class Object
  def self.enumify(*args)
    args.flatten.each_with_index do |const, i|
      const_set(const, i)
    end
  end
end

class Bowl
  enumify "ORANGES", "APPLES", "PINEAPPLES"
end

p Bowl::ORANGES # => 0
Now we are getting closer. We can now use this syntactic sugar anywhere in our app to define enums like the big boys do, but it still doesn't solve our dual access problem.
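If patching Object feels too invasive, the same sugar can live in a module that classes opt into with extend. This is a variation and not the code from the post; the module name Enumify and the class Basket are made up for this sketch.

```ruby
# Hypothetical module name; the method body mirrors enumify above.
module Enumify
  def enumify(*names)
    names.flatten.each_with_index do |name, i|
      const_set(name, i)
    end
  end
end

class Basket
  extend Enumify # only classes that ask for it get the class method
  enumify %w[ORANGES APPLES PINEAPPLES]
end

Basket::ORANGES     # => 0
Basket::PINEAPPLES  # => 2
```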
Enum with dual-access
The feature that almost all the Enum implementations I presented above are missing is the ability to access an enum member by either its value or by name.
The following implementation (original from here) suggests a more robust module-based approach. It adds an object for each enum member to store all of the member's attributes (its index, for example) and extends the enum base class with the capabilities of the Enumerable module.
I used this base class and added some more attributes to the Enum::Member class to also store the actual value and allow dual access to both the key and the value of the member:
class Enum < Module
  class Member < Module
    attr_reader :enum, :index, :syme

    def initialize(enum, index, syme)
      @enum, @index, @syme = enum, index, syme
      # Allow Color::Red.is_a?(Color)
      extend enum
    end

    def name
      self.syme.to_s
    end

    # Allow use of enum members as array indices
    alias :to_int :index
    alias :to_i :index
    alias :to_sym :syme
    # Aliased after #name is redefined above; aliasing earlier would
    # capture Module#name instead of our version
    alias :to_s :name

    # Allow comparison by index
    def <=>(other)
      @index <=> other.index if other.respond_to?(:index)
    end
    include Comparable
  end

  def initialize(*symbols, &block)
    @members = []
    symbols.each_with_index do |symbol, index|
      # Allow Enum.new(:foo)
      symbol = symbol.to_s.sub(/^[a-z]/){|letter| letter.upcase}.to_sym
      member = Enum::Member.new(self, index, symbol)
      const_set(symbol, member)
      @members << member
    end
    super(&block)
  end

  def all
    all = {}
    @members.each_with_index do |member, index|
      all[index] = member
    end
    # Return the hash; each_with_index alone would return @members
    all
  end

  def [](val)
    if val.is_a?(Numeric)
      @members[val]
    elsif val.is_a?(Symbol)
      @members.select {|member| member.syme == val }.first
    elsif val.is_a?(String)
      @members.select {|member| member.name == val }.first
    end
  end

  def size
    @members.size
  end
  alias :length :size

  def first(*args)
    @members.first(*args)
  end

  def last(*args)
    @members.last(*args)
  end

  def each(&block)
    @members.each(&block)
  end
  include Enumerable
end
Let's focus on the Enum#[] method:
def [](val)
  if val.is_a?(Numeric)
    @members[val]
  elsif val.is_a?(Symbol)
    @members.select {|member| member.syme == val }.first
  elsif val.is_a?(String)
    @members.select {|member| member.name == val }.first
  end
end
This method allows the dual-access mode I discussed before by doing the following:
- When getting a Numeric value as a parameter, it uses it as the index of the member to return.
- When getting a Symbol, it compares it to each member's #syme attribute.
- When getting a String, it compares it to each member's #name attribute.
Voila!
This implementation stores the member list in an array instance variable named @members and manually delegates the Array and Enumerable methods it exposes to that array.
Seems like we can simply inherit from Array and drop all these delegation methods.
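For comparison, Ruby's standard Forwardable module is another way to get rid of the hand-written delegation without inheriting from a collection class. This is a different technique from the one the post goes on to use, and MemberList is a hypothetical class name for this sketch only.

```ruby
require 'forwardable'

# Hypothetical class name, used only for this sketch.
class MemberList
  extend Forwardable
  include Enumerable

  # Forward these calls to the @members array instead of writing each by hand.
  def_delegators :@members, :size, :length, :last, :[], :each

  def initialize(*members)
    @members = members
  end
end

list = MemberList.new(:red, :green, :blue)
list.size         # => 3
list.first        # => :red (Enumerable#first, built on the delegated each)
list.last(2)      # => [:green, :blue]
list.map(&:to_s)  # => ["red", "green", "blue"]
```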
Ruby Set based Enum
Set implements a collection of unordered values with no duplicates. By using Set as a parent we inherit its ability to traverse members, we don't need to implement and manage delegation to an array member, and we don't need to create constants (I'll discuss that later in this article):
require 'set'

class Enum < Set
  class Member
    include Comparable

    attr_reader :symbol
    attr_reader :index
    attr_reader :name

    def initialize(value, index = 0, name = nil)
      @symbol = value.to_sym
      @index = index.to_i
      @name = name || value
    end

    def <=>(other_member)
      self.symbol <=> other_member.symbol
    end
  end

  def initialize(*members)
    super()
    populate(members)
  end

  def to_a
    super.sort_by(&:index)
  end

  def each
    block_given? or return enum_for(__method__)
    self.to_a.each { |o| yield(o) }
    self
  end

  def [](index_or_symbol)
    if index_or_symbol.is_a?(Symbol)
      return self.select {|member| member.symbol == index_or_symbol }.first
    elsif index_or_symbol.is_a?(Integer)
      return self.select {|member| member.index == index_or_symbol }.first
    end
  end

  protected

  def populate(members)
    members.each_with_index do |member, index|
      self.add(Enum::Member.new(member, index)) unless self.to_a.collect(&:name).include?(member)
    end
  end
end
First, I'll focus on the #populate method. #initialize calls it to add all the members passed to the Enum set.
The self.add line is where all the magic happens: it creates a new Enum::Member and adds it to the set, unless a member with that name already exists.
Wait! Aren't set members already unique by default?
Yes, they are. But this implementation runs into a Ruby brick wall. The Ruby Set keeps an internal hash (named @hash, surprisingly) that holds the members as keys and sets a value of true for each one when it is added.
Furthermore, Set#include? is delegated to Hash, which in turn runs native C code that compares keys through their #hash and #eql? methods. For plain objects, both fall back to object identity.
In our case it will result in something like this:
Enum::Member.new("apples").object_id == Enum::Member.new("apples").object_id
Which, no matter if both members have the same attributes, will return false.
Set#include? does not use the Comparable module (and its #<=>) that I included in Enum::Member, so we need to do the comparison on our own.
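For reference, the hook Hash does consult is the #eql? / #hash pair. The post's code does not define them; the sketch below uses a stripped-down, hypothetical stand-in for Enum::Member to show that, once they are defined, Set treats two members with the same symbol as one.

```ruby
require 'set'

# Hypothetical stand-in for Enum::Member, reduced to one attribute.
class Member
  attr_reader :symbol

  def initialize(value)
    @symbol = value.to_sym
  end

  # Hash (and therefore Set) uses these two methods to decide key equality.
  def eql?(other)
    other.is_a?(Member) && symbol == other.symbol
  end

  def hash
    symbol.hash
  end
end

set = Set.new
set.add(Member.new("apples"))
set.add(Member.new("apples"))       # same symbol, so it is not added again
set.size                            # => 1
set.include?(Member.new("apples"))  # => true
```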
And why didn't you use Constants?
Constants are easy and smart to use when you control the names and can satisfy the constant naming rules that Ruby imposes. In our case, we allow Enum::Member to be initialized with any kind of string, even one that doesn't pass the constant name restrictions. For example:
Enum::Member.new("888betonline", 0).symbol #=> :"888betonline"
# Same in a constant
Object.const_set(888BETONLINE, 0)
SyntaxError: compile error
(irb):10: syntax error, unexpected tCONSTANT, expecting ')'
Object.const_set(888BETONLINE, 0)
^
Since we don't know or even want to restrict our enum value names, dropping the constant convention out of the loop seemed like the right thing to do.
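The same restriction applies when the name is passed to const_set as a string: Ruby raises a NameError instead of a syntax error. A small sketch, where valid_constant_name? is a hypothetical helper written for this example:

```ruby
# Hypothetical helper: reports whether Ruby accepts the name as a constant.
# It tries the name on a throwaway module so nothing global is touched.
def valid_constant_name?(name)
  Module.new.const_set(name, 0)
  true
rescue NameError
  false
end

valid_constant_name?("ORANGES")       # => true
valid_constant_name?("888BETONLINE")  # => false
valid_constant_name?("oranges")       # => false (must start with an uppercase letter)
```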
Update: A clean, simple and perfect Hash
After this post was published, I talked to my friend Daniel about the Set implementation. He suggested that using Set is not required, since that #include? behavior forces me to write the presence test myself, and since I traverse the members myself anyway, the Set adds nothing.
He later came up with this suggestion:
class Enum < Hash
  def initialize(*members)
    super()
    @rev = {}
    members.each_with_index {|m,i| self[i] = m }
  end

  def [](k)
    super || @rev[k]
  end

  def []=(k,v)
    @rev[v] = k
    super
  end
end

# Members are passed as symbols so that the symbol lookup below matches
enum = Enum.new(:Apples, :Oranges)
enum[:Apples] # => 0
enum[0] # => :Apples
This solution not only answers the dual access requirement, it also drops the need for an internal Enum::Member class. Definitely a winner.
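One way to get such an enum into a class is to assign it to a constant, much like the frozen hash from the first approach. The sketch below repeats the Hash-based class so it runs on its own; Kitchen and FRUITS are names invented for the example.

```ruby
class Enum < Hash
  def initialize(*members)
    super()
    @rev = {}
    members.each_with_index { |m, i| self[i] = m }
  end

  # Try index -> member first, then fall back to member -> index.
  def [](k)
    super || @rev[k]
  end

  def []=(k, v)
    @rev[v] = k
    super
  end
end

# Hypothetical host class for the enum.
class Kitchen
  FRUITS = Enum.new(:apples, :oranges, :pineapples)
end

Kitchen::FRUITS[:oranges]  # => 1
Kitchen::FRUITS[1]         # => :oranges
Kitchen::FRUITS[:kiwi]     # => nil (unknown member)
```

Because lookups in both directions share one method, this only stays unambiguous while the member values are not themselves integers.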
Conclusion
Ruby is missing a real enum implementation. I would be happy to see an official one come out in one of the next releases. Until then, the Hash implementation is the one I will be using.
Source available here, fork away.
Comments
dbenjamin57 commented 3 months ago
This is great. Thanks!
makaroni4 commented 3 months ago
@eladmeidar would you mind if I send your post to #ruby reddit? Or you can do it yourself
eladmeidar commented 3 months ago
@makaroni4 it was already submitted. i think.
makaroni4 commented 3 months ago
@eladmeidar seems like no: http://www.reddit.com/search?q=url%3Ahttp%3A%2F%2Fgistflow.com%2Fposts%2F682-ruby-enums-approaches
eladmeidar commented 3 months ago
@makaroni4 feel free :)
britishtea commented 3 months ago
It seems to me that you could simply use an Array here, as the final Enum is still mutable. You'll also still need to get it into your class somehow. Your first solution is still the most elegant, imo.