Swift String Demystified

A basic knowledge of the Unicode character set and of the UTF-8, UTF-16 and UTF-32 encodings will make Swift strings easier to understand. If you are not familiar with these, please consider reading /unicode-character-set-and-utf-8-utf-16-utf-32-encoding/ first.

This tutorial is updated for Swift 3.

Swift Strings Representation

Swift Strings are a collection of characters. For example "Hello World", "Señor" or "1 Infinite Loop, Cupertino" are all Swift strings.

Strings are represented using the String type in Swift.

var a: String = "Hello World"  
let b = "Welcome to the Club"  

Swift strings are Unicode compliant under the hood. This ensures that characters from any language can be used in Swift.

The code below contains a string with é, an e with an acute accent.

var unicodeSample = "Café"
print(unicodeSample) // prints Café


String Mutability

Strings can be created to be mutable or immutable using the var or let keyword respectively. Mutable strings can be changed after they are declared, whereas this is not possible in immutable strings as they are constants.

Let's look at an example where we declare a mutable string and then append another string to it.

var name = "Steve"  
name += " Jobs" // appending Jobs to name  
print(name) // Steve Jobs is printed  

The += operator appends a string to an existing mutable string.

let declares immutable strings. Once a string is declared using let, no more modifications are allowed on that string.

let city = "New York"  
city += ", USA" // compilation error: city is a let constant


Empty Strings

Empty strings can be declared either with an empty string literal "" or with the String initializer.

var name = ""  
var address = String()  

We can append new strings to the empty strings.

var name = ""  
var address = String()  
name += "Steve Jobs"  
address += "Cupertino"  
print("Name:",name, ", Address:",address) //"Name: Steve Jobs , Address: Cupertino" is printed  


Testing for empty string

Strings can be tested for emptiness using the isEmpty property.

var nowhere = ""  
if nowhere.isEmpty {  
    print("nowhere is empty")
}

The above program outputs "nowhere is empty"
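As a small sketch of what isEmpty does and does not check: a string that contains only spaces still has characters, so it is not empty. The names blank and spaces are illustrative and not part of the example above.

```swift
let blank = ""
let spaces = "   "

// isEmpty is true only when the string has no characters at all.
print(blank.isEmpty)   // true
print(spaces.isEmpty)  // false
```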

String Interpolation

String interpolation is a simple way to create new strings by inserting other strings, integers, floats and so on. It is done by placing the item we want to insert between a backslash followed by an opening parenthesis and a closing parenthesis: \(item)

Let's look at how this works with an example. Assume that we have the price of a book and we want to prefix the price with a "$" symbol. This can be achieved using string interpolation.

let price = 250  
let currency = "$"  
let displayPrice = "\(currency)\(price)"//interpolation  
print("price of the book is",displayPrice)  

The above program outputs price of the book is $250. The displayPrice constant is created by interpolating currency with \(currency) and price with \(price).

Let's write another program which prints the multiplication table of a number using interpolation.

var number = 8  
for i in 1...10 {  
    print("\(number) times \(i) is","\(number * i)")
}

The above program uses interpolation to display number with \(number), i with \(i) and the product of number and i with \(number * i).

This program outputs

8 times 1 is 8
8 times 2 is 16
8 times 3 is 24
and so on.
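Interpolation is not limited to plain variables; any expression can appear inside \( ). The following minimal sketch uses hypothetical width and height values to show an arithmetic expression evaluated inside a string literal.

```swift
let width = 3
let height = 4

// Two variables and one arithmetic expression are interpolated directly.
let summary = "A \(width) x \(height) rectangle has an area of \(width * height)"
print(summary) // prints A 3 x 4 rectangle has an area of 12
```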

Unicode Scalars

Swift strings can also be constructed using Unicode scalars. A Unicode scalar is any Unicode code point except the surrogate code points (U+D800 to U+DFFF); its value is a 21-bit number, which is also its UTF-32 representation. The syntax for writing a Unicode scalar in a string literal is \u{n} where n is the hexadecimal code point.

Let's construct a string from a Unicode scalar.

The Unicode code point of A is U+0041. Using this code point, the string A can be constructed using the format \u{0041}

var a = "\u{0041}" // string created using a Unicode scalar
print(a) // prints A

The above program outputs A.

The program below creates our beloved Hello World string using unicode scalars.

let beloved = "\u{0048}\u{0065}\u{006C}\u{006C}\u{006F} \u{0057}\u{006F}\u{0072}\u{006C}\u{0064}"  
print(beloved) //prints Hello World  


Different combinations of unicode scalars can produce the same character.

Let's look at what this means with an example.

let nTilde = "\u{00F1}" // ñ
let nTildeCombination = "\u{006E}\u{0303}" // also ñ

In the above program the Latin letter ñ is constructed using the single code point U+00F1 in the first line.

The other way of creating ñ is to use the code point for n, U+006E, and follow it with the code point for the combining tilde, U+0303. Combining both in a single string "\u{006E}\u{0303}" also results in ñ.
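The difference between the two spellings only shows up when looking at the scalars themselves. As a small sketch, counting the elements of the unicodeScalars view makes it visible:

```swift
let nTilde = "\u{00F1}"                    // precomposed ñ
let nTildeCombination = "\u{006E}\u{0303}" // n followed by a combining tilde

// The unicodeScalars view exposes how each string is actually built.
print(nTilde.unicodeScalars.count)            // 1
print(nTildeCombination.unicodeScalars.count) // 2
```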

Accessing the Unicode Scalars of a String

The Unicode scalars of a string can be accessed using the unicodeScalars property. Let's write a program that uses this property to print the Unicode scalars of a string.

import Foundation // String(format:) comes from Foundation

let enjoy = "Let's go 🏊"
enjoy.unicodeScalars.forEach {
    let hex = String(format: "%X", $0.value)
    print(hex, terminator: " ")
}

The above program uses the value property of UnicodeScalar, which provides the numeric representation of the Unicode scalar. It is converted to hexadecimal using the "%X" format specifier and printed. The terminator parameter of print is used to specify a space as the terminator instead of the default newline.
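If pulling in Foundation just for the format string is not desirable, one possible alternative is the standard library initializer String(_:radix:uppercase:). The sketch below assumes a shorter sample string, written with an escape so the emoji's code point (U+1F3CA) is explicit.

```swift
let swimmer = "Hi \u{1F3CA}" // "Hi " followed by the swimmer emoji

// Convert each scalar value to an uppercase hexadecimal string.
let hexValues = swimmer.unicodeScalars.map {
    String($0.value, radix: 16, uppercase: true)
}
print(hexValues.joined(separator: " ")) // prints 48 69 20 1F3CA
```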

Character

A character is a single Unicode scalar or a group of Unicode scalars that form a single human readable letter. Unicode calls such a group an extended grapheme cluster.

A character is represented using the Character type in Swift.

Let's look at some code samples to better understand Character. The code point of A is U+0041.

let aChar: Character = "\u{0041}"  
print(aChar) //prints A  

In the above example, the single unicode code point \u{0041} forms the human readable character A. Hence aChar is a character.

The French letter é can be formed by combining the code point of e, U+0065, with the combining acute accent, U+0301.

let acuteAccent: Character = "\u{0065}\u{0301}"
print(acuteAccent) // prints é

In the above example, two code points U+0065 and U+0301 combine to form the single human readable character é. Hence acuteAccent is a Character and its value is é.

Characters can also be created using the human readable representation directly.

let aHumanReadable: Character = "A"  
let acuteAccentHumanReadable: Character = "é"  
print(aHumanReadable, acuteAccentHumanReadable)// prints A é  

The above example is pretty straightforward. It prints A and é.
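Accented letters are not the only characters built from several scalars. As an illustrative sketch, a flag emoji is a single Character made of two regional indicator scalars; the names flag and scalarCount are chosen for this example only.

```swift
// Two regional indicator scalars (U and S) form one flag character.
let flag: Character = "\u{1F1FA}\u{1F1F8}"

// Wrapping the character in a String gives access to its scalars.
let scalarCount = String(flag).unicodeScalars.count
print(scalarCount) // 2
```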

Accessing characters of a string

String type has a characters property which provides access to all the characters of a string. This property can be used to iterate through the individual characters of a string.

Let's write some code to understand characters better.

let happy = "I ate 🍕 at a restaurant near my home"  
for char in happy.characters {  
    print(char, terminator: "")
}

In the above program happy.characters returns the collection of characters of the string happy, which is iterated using a for loop and printed. This program outputs

I ate 🍕 at a restaurant near my home

Finding the length of a string

The length of the string is the total number of human readable characters in that string i.e. the number of Swift Character types in that String.

The length of the string can be found using the count property on the characters collection of the string.

let mystring = "Hello World"  
print("length is", mystring.characters.count)

The above program outputs length is 11

Let's look at one more example to make our understanding of characters and string length clearer.

let senorHumanReadable = "Señor" //Señor  
let senorUnicode = "Se\u{00F1}or" //this is also Señor  
let senorExtendedUnicode = "Se\u{006E}\u{0303}or" //even this is Señor

print("length of senorHumanReadable =",senorHumanReadable.characters.count)//prints 5  
print("length of senorUnicode =", senorUnicode.characters.count)// prints 5  
print("length of senorExtendedUnicode =", senorExtendedUnicode.characters.count)// prints 5  

In the above code the length of all three strings senorHumanReadable, senorUnicode, senorExtendedUnicode is the same and it is 5.

As we know already, the character ñ in the String Señor can be represented either as ñ or using Unicode Scalar as \u{00F1} or combination of Unicode Scalars \u{006E}\u{0303}

So all these three refer to the same human readable string Señor. Hence all of them are of the same length 5.
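The character count is only one of several ways to measure a string. The sketch below compares the counts of the different views for the decomposed spelling. It assumes Swift 4 or later, where String is itself a collection and count can be called directly; in Swift 3 the first line would use senor.characters.count instead.

```swift
let senor = "Se\u{006E}\u{0303}or" // ñ written as n + combining tilde

print(senor.count)                // 5 characters
print(senor.unicodeScalars.count) // 6 Unicode scalars
print(senor.utf16.count)          // 6 UTF-16 code units
print(senor.utf8.count)           // 7 UTF-8 bytes
```

Only the character count matches what a reader sees; the other views report how the string is stored in a given encoding.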

String Indices

As we already know, Swift strings are a collection of characters. A character can be a single Unicode scalar or a combination of Unicode scalars. Characters are therefore not of a fixed size and can occupy a varying amount of memory.

The English letter A occupies 1 byte whereas the emoji 😭 occupies 4 bytes when encoded using UTF-8. To replace A with some other character only 1 byte has to be changed, whereas replacing the emoji 😭 requires 4 bytes to be changed. Integer indexes cannot express this, because indexing a string with integers assumes that every character occupies the same number of bytes. Say we index a string by incrementing an integer and read the first character from byte 0. If we then try to read the second character from byte 1, we might get a wrong value, since the first character may have occupied 2 or more bytes. This is where the index type comes into the picture.

Each String in Swift has an Index type which gives us the position of a character in the string.

The startIndex property returns the position of the first character in the string.

The endIndex property returns the position after the last character of the string. Hence accessing a character at endIndex will result in a runtime error.
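To see how startIndex and endIndex work together, here is a minimal sketch that walks a string one character at a time with index(after:). The sample string and the collected array are illustrative; the emoji is written as an escape (U+1F62D) so its size is unambiguous.

```swift
let text = "A\u{1F62D}B" // A, the crying face emoji, B

var collected: [Character] = []
var index = text.startIndex
// Advance one character at a time until the past-the-end position.
while index != text.endIndex {
    collected.append(text[index])
    index = text.index(after: index)
}

print(collected.count) // 3 characters
print(text.utf8.count) // 6 bytes: 1 + 4 + 1
```

The loop never has to know how many bytes each character occupies; the index type takes care of that.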

Let's look at some operations that can be performed on strings to make our understanding of indexes clearer.

String Operations

Accessing the first character of a string

let myString = "Hello World"  
let firstChar = myString[myString.startIndex] //H  

myString.startIndex will return the index of the first character in the string. We use the subscript syntax myString[myString.startIndex] to get the first character of a string.

Accessing the last character of a string

let myString = "My Café"  
let lastIndex = myString.index(myString.endIndex, offsetBy: -1) //index of the last character  
let lastChar = myString[lastIndex] //é  

myString.endIndex gives the index after the last character of the string. The method index(String.Index, offsetBy: String.IndexDistance) offsets the index by the number that is specified in the offsetBy: parameter. We pass the endIndex to this method and offset it by -1 to get the index of the last character of the string.
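Offsetting an index past the bounds of the string is a runtime error. One way to guard against that is index(_:offsetBy:limitedBy:), which returns nil instead of crashing. The helper character(at:in:) below is hypothetical and only sketches the idea.

```swift
// Hypothetical helper: returns the character at an integer offset,
// or nil when the offset is outside the string.
func character(at offset: Int, in text: String) -> Character? {
    guard offset >= 0,
        let index = text.index(text.startIndex, offsetBy: offset, limitedBy: text.endIndex),
        index != text.endIndex
    else { return nil }
    return text[index]
}

let cafe = "My Caf\u{E9}"
let last = character(at: 6, in: cafe)    // é
let missing = character(at: 7, in: cafe) // nil, offset 7 is endIndex
```

Note that the helper still has to walk the string from the start, so it is not a constant-time lookup.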

Indices of all characters of a string

The indices property of the characters property returns the index of each and every character of a string

let senor = "Hello Señor"  
for index in senor.characters.indices {  
    print(senor[index], terminator: "")// prints Hello Señor
}

In the above program senor.characters.indices is iterated using a for loop and every character of the string is printed using the subscript syntax senor[index]

Inserting characters into a string

The String method func insert(Character, at: String.Index) is used to insert a single character at position String.Index

Let's write a program to insert a character at the first position of the string.

var testString = "ello world"
testString.insert("H", at: testString.startIndex) // testString is now "Hello world"

The above program inserts H in the first position of the string.

Let's insert a character at the end of a string.

var food = "Pizz"  
food.insert("a", at: food.endIndex) //food is now Pizza  

food.endIndex gives the position right after the end of food and a is inserted in that position. The result is Pizza.

Consider the String "iPhone runs OS". Let's write a program to convert this string to "iPhone runs iOS".

import Foundation // range(of:) comes from Foundation

var phoneOS = "iPhone runs OS"

// lowerBound of the matched range is the index where "OS" starts
if let position = phoneOS.range(of: "OS")?.lowerBound {
    phoneOS.insert("i", at: position) // now phoneOS is "iPhone runs iOS"
}

The range(of:) method returns an optional Range which contains the starting index of the String "OS" in the String "iPhone runs OS", and this can be accessed using the lowerBound property. Now that we have the position to insert at, we use the method func insert(_ newElement: Character, at i: String.Index) to insert the character "i" at that position, and this results in the String "iPhone runs iOS".

So far we have seen methods that insert a single character into a string. Now let's look at methods that insert a String into a String.

Let's write a program to insert a string at the beginning of another string.

var awesomeLanguage = "is an awesome language"
awesomeLanguage.insert(contentsOf: "Swift ".characters,
                       at: awesomeLanguage.startIndex) // Swift is an awesome language

The insert(contentsOf:at:) method inserts a collection of characters at a specified index. We convert the String "Swift " to a collection of characters and insert it at the start of the string, which gives "Swift is an awesome language".

Search and replace

A common operation is to search for a substring within a String and to replace it with some other string. We will replace "oranges" in the string "I love oranges a lot" with "apples".

Let's see how this is done in Swift.

import Foundation // range(of:) comes from Foundation

var loveFood = "I love oranges a lot"
if let rangeOfOranges = loveFood.range(of: "oranges") {
    loveFood.replaceSubrange(rangeOfOranges, with: "apples")
}

The method func replaceSubrange(_ bounds: Range, with newElements: String) takes a range as input and replaces the string in that range with the given string. Here we pass the range of oranges and replace it with apples, so loveFood becomes "I love apples a lot".
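replaceSubrange changes a single range. When every occurrence has to be replaced, Foundation's replacingOccurrences(of:with:) is one option. The sketch below uses an illustrative sentence that is not part of the example above.

```swift
import Foundation // replacingOccurrences(of:with:) comes from Foundation

let original = "oranges, oranges and more oranges"
// Returns a new string; the original is left unchanged.
let replaced = original.replacingOccurrences(of: "oranges", with: "apples")
print(replaced) // prints apples, apples and more apples
```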

Substrings of a String

The following program prints the words of a string, that is, the substrings separated by spaces.

import Foundation // components(separatedBy:) comes from Foundation

let book = "Dan Brown is the Author of Da Vinci Code"
for substring in book.components(separatedBy: " ") {
    print(substring)
}

The components(separatedBy:) method returns an array of strings containing the substrings that were divided by the separator. The above program outputs

Dan
Brown
is
the
Author
of
Da
Vinci
Code
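One detail worth knowing is how consecutive separators are handled. The sketch below compares components(separatedBy:) with the standard library's split(separator:), using an illustrative string with two spaces in a row. It assumes Swift 4 or later, where split can be called on a String directly.

```swift
import Foundation // components(separatedBy:) comes from Foundation

let spaced = "Da  Vinci" // note the two spaces

// components(separatedBy:) keeps the empty string between the two spaces.
let components = spaced.components(separatedBy: " ")
// split(separator:) omits empty pieces by default.
let pieces = spaced.split(separator: " ").map { String($0) }

print(components) // ["Da", "", "Vinci"]
print(pieces)     // ["Da", "Vinci"]
```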

Remove characters in a String

The remove(at: String.Index) method removes the character at the specified index.

The following program removes the last character of a string

var singular = "Pizzas"  
singular.remove(at: singular.index(before: singular.endIndex))  
print(singular) // prints Pizza
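remove(at:) deletes one character. To delete a whole run of characters, removeSubrange(_:) accepts a range of indices. A minimal sketch, with an illustrative greeting string:

```swift
import Foundation // range(of:) comes from Foundation

var greeting = "Hello, cruel world"
// Remove the whole matched range instead of a single character.
if let range = greeting.range(of: "cruel ") {
    greeting.removeSubrange(range)
}
print(greeting) // prints Hello, world
```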

Appending characters to a String

Another common operation on strings is appending a single character or another string to an existing String. Let's look at how this is done.

The + operator is used to append strings and create a new one.

var sentence = "Cars make" + " commuting easier"  
print(sentence) //Cars make commuting easier  

The append(_:) method can also be used to append characters and strings.

var health = "Apple"  
health.append("s")  
health.append(" are good for health")  

In the above program, first "s" is appended to health and then the String " are good for health" is appended, so health becomes "Apples are good for health".
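To make the difference between appending in place and creating a new string explicit, here is a small sketch. The names fruit, plural and shout are illustrative; plural is annotated as a Character so that the Character overload of append is used.

```swift
var fruit = "Apple"
let plural: Character = "s"

fruit.append(plural)        // appends a Character in place
fruit.append(" are tasty")  // appends a String in place
let shout = fruit + "!"     // + builds a new String, fruit is unchanged

print(fruit) // prints Apples are tasty
print(shout) // prints Apples are tasty!
```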

String Comparison

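Strings are compared using the == operator. Two strings are considered equal if their characters are canonically equivalent, that is, if they have the same human readable representation even when they are built from different Unicode scalars.

Let's look at an example.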

var unicodeScalarAccent = "I was at the caf\u{E9}"  
var humanReadableAccent = "I was at the café"  
if unicodeScalarAccent == humanReadableAccent {  
    print("both the strings are equal")
}

In the above example both the Strings represent the same human readable text. Hence they are both equal.
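The comparison becomes more interesting when the two strings are built from different scalars. In the sketch below, one string uses the single scalar U+00E9 and the other uses e followed by the combining acute accent; the names precomposed and decomposed are illustrative.

```swift
let precomposed = "caf\u{E9}"       // é as one scalar
let decomposed = "caf\u{65}\u{301}" // e followed by a combining acute accent

// == compares characters, so the two spellings are equal.
print(precomposed == decomposed) // true

// The underlying scalars are nevertheless different.
print(precomposed.unicodeScalars.elementsEqual(decomposed.unicodeScalars)) // false
```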

That's it for the basics of Strings. I recommend reading the Swift String documentation, which has tons of useful methods that will make your life easier when it comes to string processing.

Thanks for reading and please leave your thoughts in the comments section.
