A Practical Example For HoledRange

Someone asked me a non mathematical example of using the Domain thing, so here goes.

Let's imagine I am writing a network mapper or a load balancer of some description. I have a range of IPs that are available and I want to pick a few from them, or check that an IP is in a white or black list that's handled by ranges.

First I have to define what an IP address is (I went IPV4 because it's shorter): a group of 4 numbers from 0 to 255 (0 and 255 are "special", but it only affects a tiny portion of the code). I want to build them from that representation, strings (because we love strings) and a 32 bits number because that's what it ultimately is:

struct IPAddress {
    var group1: UInt8
    var group2: UInt8
    var group3: UInt8
    var group4: UInt8
    
    init(_ c1: UInt8, _ c2: UInt8, _ c3: UInt8, _ c4: UInt8) {
        group1 = c1
        group2 = c2
        group3 = c3
        group4 = c4
    }
    
    init?(_ s: String) {
        let comps = s.components(separatedBy: ".")
        if comps.count != 4 { return nil }
        if let c1 = UInt8(comps[0]), let c2 = UInt8(comps[1]), let c3 = UInt8(comps[2]), let c4 = UInt8(comps[3]) {
            group1 = c1
            group2 = c2
            group3 = c3
            group4 = c4
        } else {
            return nil
        }
    }
    
    init(_ ip: UInt32) {
        let c1 = ip >> 24
        let c2 = (ip >> 16) & 0x00ff
        let c3 = (ip >> 8) & 0x0000ff
        let c4 = ip & 0x000000ff
        group1 = UInt8(c1)
        group2 = UInt8(c2)
        group3 = UInt8(c3)
        group4 = UInt8(c4)
    }
    
    var asString : String {
        return "\(group1).\(group2).\(group3).\(group4)"
    }
    
    var asUInt32 : UInt32 {
        let c1 = UInt32(group1) << 24
        let c2 = UInt32(group2) << 16
        let c3 = UInt32(group3) << 8
        let c4 = UInt32(group4)
        
        return c1 | c2 | c3 | c4
    }
}

I also included to string and to 32 bits because I could, and because it's generally good practice to be able to output what you accept as input.

If I want to make a domain out of that, they need to be Hashable, and Comparable:

extension IPAddress : Equatable, Comparable, Hashable {
    static func == (lhs : Self, rhs : Self) -> Bool {
        return lhs.group1 == rhs.group1 && lhs.group2 == rhs.group2 && lhs.group2 == rhs.group3 && lhs.group4 == rhs.group4
    }
    static func != (lhs : Self, rhs : Self) -> Bool {
        return !(lhs == rhs)
    }
    static func < (lhs : Self, rhs : Self) -> Bool {
        if lhs.group1 < rhs.group1 { return true }
        else if lhs.group1 == rhs.group1 && lhs.group2 < rhs.group2 { return true }
        else if lhs.group1 == rhs.group1 && lhs.group2 == rhs.group2 && lhs.group3 < rhs.group3 { return true }
        else if lhs.group1 == rhs.group1 && lhs.group2 == rhs.group2 && lhs.group3 == rhs.group3 && lhs.group4 < rhs.group4 { return true }
        else { return false }
    }
    static func <= (lhs : Self, rhs : Self) -> Bool {
        return lhs == rhs || lhs < rhs
    }
    static func > (lhs : Self, rhs : Self) -> Bool {
        if lhs.group1 > rhs.group1 { return true }
        else if lhs.group1 == rhs.group1 && lhs.group2 > rhs.group2 { return true }
        else if lhs.group1 == rhs.group1 && lhs.group2 == rhs.group2 && lhs.group3 > rhs.group3 { return true }
        else if lhs.group1 == rhs.group1 && lhs.group2 == rhs.group2 && lhs.group3 == rhs.group3 && lhs.group4 > rhs.group4 { return true }
        else { return false }
        
    }
    static func >= (lhs : Self, rhs : Self) -> Bool {
        return lhs == rhs || lhs > rhs
    }
    
    func hash(into: inout Hasher) {
        into.combine(group1)
        into.combine(group2)
        into.combine(group3)
        into.combine(group4)
    }
    
    var hashValue: Int {
        return Int(self.asUInt32)
    }
}

This stuff isn't quite boilerplate, but not very far from it. Comparing two IP addresses is about comparing the 4 groups of numbers, in order of importance.

Finally, because I want to sample my ranges, I'd like a Randomizable implementation as well, which is more interesting than it seemed at first glance:

extension IPAddress : Randomizable {
    static func randomElement() -> IPAddress? {
        
        let c1 = UInt8.random(in: 1...254)
        let c2 = UInt8.random(in: 1...254)
        let c3 = UInt8.random(in: 1...254)
        let c4 = UInt8.random(in: 1...254)
        
        return IPAddress(c1,c2,c3,c4)
        
    }
    static func randomElement(in r: ClosedRange<IPAddress>) -> IPAddress? {
        let lb = r.lowerBound
        let ub = r.upperBound
        
        if lb == ub { return lb }
        if ub < lb { return nil }
        
        let c1 = UInt8.random(in: lb.group1...ub.group1)
        let c2: UInt8
        let c3 : UInt8
        let c4 : UInt8
        
        if c1 == lb.group1 || c1 == ub.group1 { // special case , because we have to check the more minor parts of the group as well
            if c1 == lb.group1 && c1 == ub.group1 { // same group 1 for upper and lower bounds
                c2 = UInt8.random(in: lb.group2...ub.group2)
            } else if c1 == lb.group1 {
                c2 = UInt8.random(in: lb.group2...254)
            } else { //  if c1 == ub.group1
                c2 = UInt8.random(in: 1...ub.group2)
            }
            // same problem, again
            if c2 == lb.group2 || c2 == ub.group2 {
                if c2 == lb.group2 && c2 == ub.group2 {
                    c3 = UInt8.random(in: lb.group3...ub.group3)
                } else if c2 == lb.group2 {
                    c3 = UInt8.random(in: lb.group3...254)
                } else {
                    c3 = UInt8.random(in: 1...ub.group3)
                }
                // and finally
                if c3 == lb.group3 || c3 == ub.group3 {
                    if c3 == lb.group3 && c3 == ub.group3 {
                        c4 = UInt8.random(in: lb.group4...ub.group4)
                    } else if c3 == lb.group3 {
                        c4 = UInt8.random(in: lb.group4...254)
                    } else {
                        c4 = UInt8.random(in: 1...ub.group4)
                    }
                } else {
                     c4 = UInt8.random(in: 1...254)
                }
            } else {
                c3 = UInt8.random(in: 1...254)
                c4 = UInt8.random(in: 1...254)
            }
        } else {
            c2 = UInt8.random(in: 1...254)
            c3 = UInt8.random(in: 1...254)
            c4 = UInt8.random(in: 1...254)
            
        }
        
        return IPAddress(c1, c2, c3, c4)
    }
}

As for why random IPs don't contain 0 or 255, it's because they are special in some cases and I wanted to spare myself the headache. Note the hierarchy of groups as well for randomization...

Finally a test program:

       if let ip = IPAddress.randomElement(), let ip2 = IPAddress.randomElement() {
            let mi = min(ip,ip2)
            let ma = max(ip,ip2)
            let d = Domain(mi...ma)
            if let ip3 = d.randomSample(5) {
                print("["+ip.asString+";"+ip2.asString+"] => \(ip3.map({ $0.asString }))")
            }
        }

        let d = Domain(IPAddress(192,168,125,1)...IPAddress(192,168,125,10))
        if let ip3 = d.randomSample(5) {
            print("["+d.lowerBound!.asString+";"+d.upperBound!.asString+"] => \(ip3.map({ $0.asString }))")
        }
[58.47.159.20;110.200.52.86] => ["92.196.40.9", "89.184.245.202", "69.19.204.59", "101.41.128.20", "74.231.21.246"]
[192.168.125.1;192.168.125.10] => ["192.168.125.4", "192.168.125.1", "192.168.125.9", "192.168.125.3", "192.168.125.10"]

Seems legit.

That's a Nice Range! Can I Poke Holes In It?

Because of various Machine Learning and brain teaser problems I focused on, I have been developing for my own use an "extension" to the standard swift type Range that allowed for a very particular behavior: having holes.

TL;DR: Grab the code/package on the github repo

Why?!? 😳

I need this class for two scenarii (that's the plural of scenario, deal with it):

  • Being able to sample continuous data sets (eg "I want a value between 0 and 1, but not between 0.45 and 0.55 because the middle sucks")
  • Being able to reconcile continuous data sets (eg "This tells me it can be between 0 and 1, and this tells me it can be between 10 and 11")

In both instances, I ended up writing a lot of uninteresting and similar code, so I decided to put it all under a single umbrella.

A little bit of personal history and maths 🤕

Practictionners of the dart art known as maths entertain a love/hate relationship with "domains". It's one of these caveats in otherwise universal rules and theorems that restrict the use of all these wonderful tools, and both helps with proofs and hinders usage.

Some functions have a domain definition (eg 1/x exists for every value but x=0. No, infinity isn't the answer), some have a resulting domain (eg is always positive), and some functions have a really weird set of both.

The main problem with maths is that, for a lot of people, it isn't very... permissive, shall we say. There seems to be more "don't"s than "do"s, and the practical application of all these wonderful theorems seems out of reach.

But it's not true that maths is counter-intuitive. I am a firm believer that the "intelligence" we humans exhibit, as opposed to our genetic relatives on the tree of life, is strongly related to our brain's ability to do maths. If I have 4 apples today and eat one per day, I'd better find a supply for apples very shortly, because I won't have any apple left soon. That ability to extrapolate numbers and trends is what differentiates us from, say, our forager ancestors who went looking for food when they were hungry, and didn't plan much ahead.

It is my humble opinion that language exists because of our ability to do maths, and not the other way around. Of course, you might argue that advanced maths is in itself a new language, and quite disconnected from my need to find some apples for next week, but to me, it looks like parallel evolution rather than anything else.

Our brains are wired to count, extrapolate, share, split, combine, and calculate. The rest is a strategy to ensure that I have an infinite supply of apples. Maybe my offspring too, if they behave nicely.

Because of my special condition, maths isn't any less intuitive than driving or drawing. It's just a matter of practice.

The more immediate problem 🙄

Let's say I have a data set that tells me that a variable can be either negative or positive, but that its absolute value can't be beyond 10, and it can't be 0.

[-10;0[ ∪ ]0;10]

That's easy right? Every time I update that variable, I can always use something like:

v = min(10,max(-10,v)) // for constraints
if v == 0 { // ?
}

What if it's something more complicated? Something that exists between 0 and 1 or between 10 and 11, or between 100 and 101?

[0;1] ∪ [10;11] ∪ [100;101]

My list of min and max calls and if/else or switch statements will become very complicated very soon...

What if it's the intersection of two sets of ranges like the one above? Think marketing segmentation, where you are interested in age groups intersected over two different statistical analyses...

What if it's a large range excluding certain values or ranges?

Of course, because we are dealing with computers, we have the other problem to contend with: infinity does not exist

Implementation choices 🤓

In maths, compound ranges are a collection of open and closed ranges linked together by unions. In swift, Range and ClosedRange are distinct unrelated types. Having a collection that contains both is painful for standard operations like unions and merging and symmetric difference, because it implies changing the type of object depending on the operations:

]0;1[ ∩ [-1;0.5] = ]0;0.5]

(please note the changes in inclusion and exclusion of the bounds)

I decided to go with ClosedRange and a touch of excluded values to deal with open ranges. The reasoning behind that is that it also allows me to exclude single values easily, which is one of the main uses of this library.

Most of the standard operations are fairly easy to implement (adding or unioning two ranges is just adding a new item to the collection of ranges, removing ranges means changing the bounds, etc), but they often require an optimizitation pass, which is algorithmically complex: I need to merge overlapping ranges to keep the list as small as possible.

Intersection is pretty straightforward as well: you "just" remove the ranges that aren't common. It is implemented in two steps:

  • find the ranges that share values (overlaps)
  • shrink the result to the common values with min and max

The most interesting one (for me) is the symmetric difference of two ranges: elements that are in either range, but not in both (good old XOR).
The idea here is pretty similar: we have to find the intersection (the common elements) and remove them from the union (all the elements of both ranges).

The problem lies, of course, in the details. Because my Bound type conforms to Comparable and my range needs to satisfy lowerBound <= upperBound, I need to check how the ranges overlap. Is one containing the other? Is one straddling the bound of the other one? Draw it on a piece of paper and you'll see we have a lot of different scenarii (AGAIN!)

What's next? 😱

I don't know. This library works for my current purposes. I will probably continue to add stuff to it as time passes. I have in the back of my mind an idea of how to use this for a constraint solver engine. Maybe.

If you find it useful or want to add stuff to it, feel free!  I'm sure there are some bugs and edge cases I havent' taken into account. I tried to document it fully and provide the unit tests that I thought were appropriate, so if you decide to contribute, please carry on with those two good habits.