Landing on Elixir: Processing Immutable Data
2021-08-11 Elixir data struct immutable
Acknowledgements
I want to thank Po Chen for reviewing this post and providing valuable feedback.
- Landing on Elixir: Pattern Matching
- Landing on Elixir: Processing Immutable Data
The pain of processing immutable data
The other day I saw a block of code whose author had clearly struggled to process the data. Here’s a simplified version of it:
defp transform_store(store) do
  retailers = store["retailers"]
  logo_url = retailers["logoUrl"]
  retailers =
    if logo_url != nil do
      retailers = Map.delete(retailers, "logoUrl")
      logo_url = @prefix <> logo_url
      Map.put(retailers, "logoUrl", logo_url)
    else
      retailers
    end
  retailer_id = retailers["id"]
  company = retailers["company"]
  company_map_id = company["id"]
  company = Map.delete(company, "id")
  company = Map.put(company, "_id", company_map_id)
  retailers = Map.delete(retailers, "id")
  retailers = Map.delete(retailers, "company")
  retailers = Map.put(retailers, "_id", retailer_id)
  retailers = Map.put(retailers, "company", company)
  store_map_id = store["id"]
  store = Map.delete(store, "id")
  store = Map.delete(store, "retailers")
  store = Map.put(store, "retailer", retailers)
  store = Map.put(store, "_id", store_map_id)
  store
end
I had to strain to figure out that it transforms data with the following shape:
%{
  "point" => [_latitude, _longitude],
  "id" => "store-id",
  "retailers" => %{
    "id" => "retailer-id",
    "logoUrl" => "/path/to/logo",
    "company" => %{
      "id" => "company-id"
    }
  }
}
into:
%{
  "_id" => "store-id",
  # note that `retailers` has been renamed to
  # `retailer` (singular form)
  "retailer" => %{
    "_id" => "retailer-id",
    "logoUrl" => @prefix <> "/path/to/logo",
    "company" => %{
      "_id" => "company-id"
    }
  }
}
I could feel the author's struggle in writing this code, most likely because they were not familiar with the APIs Elixir provides.
But processing immutable data in Elixir can be much more fun than this. In this article, we'll discuss some methods for processing data in Elixir, and see how this code can be rewritten to be maintainable and reusable.
To warm up, let's discuss some basic concepts of processing immutable data.
Immutable Data
Immutable data is not a new concept. It is a foundation of functional programming. But its consequences may not be obvious at first, and we may struggle when dealing with data, especially data in a deeply nested shape.
Technically, we cannot update data in place. What we can do is use a function to transform it into a new piece of data, leaving the original unchanged.
The original data (O) is still available after processing, and the new output data (O') is a different value. The two can share common parts for better performance, but they are two distinct objects.
The concept is simple, but we need to get used to it because it leads to different processing patterns than mutable data.
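As a minimal sketch of this idea (the variable names are illustrative and not taken from the original code), `Map.put/3` returns a new map and leaves the map it was given untouched:

```elixir
original = %{"id" => "ABCDEF"}

# Map.put/3 does not modify `original`; it returns a new map.
updated = Map.put(original, "name", "example")

IO.inspect(original) # still has only the "id" key
IO.inspect(updated)  # has both "id" and "name"

# Rebinding a variable name does not mutate the old value either:
# `snapshot` keeps referring to the map as it was before.
snapshot = original
original = Map.put(original, "id", "abcdef")

IO.inspect(snapshot)
IO.inspect(original)
```

Rebinding `original` only changes what the name refers to from that point on; anything that still holds the old value, like `snapshot`, sees no change.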
Updating a value in a map
Let's assume you have a piece of data like this:
data = %{"id" => "ABCDEF"}
And you need to lower case the id value ("ABCDEF"), with all other fields remaining the same. The expected result is:
iex> downcase_id.(data)
%{"id" => "abcdef"}
Solutions:
You can achieve it by using the | (map-specific update) operator:
downcase_id = fn %{"id" => id} = data ->
  %{data | "id" => String.downcase(id)}
end
Or by using Map.update!/3:
downcase_id = fn data ->
  Map.update!(data, "id", &String.downcase/1)
end
There are other helpful functions provided by the Map module, e.g.:
- replace/3 for replacing a value under an existing key
- update/4 for updating a value based on its original value, or setting a default if the key is not already present
- put/3 for setting a key-value pair whether or not the key is already in the map
- put_new/3 for setting a key and its value only if the key is not already in the map
You can choose the right function to use depending on whether the key is already in the map and if the update is based on the original value.
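The differences are easiest to see side by side. The following is a small sketch on a made-up map (the variable names and values are only for illustration):

```elixir
data = %{"id" => "ABCDEF"}

# replace/3: only changes a key that already exists.
replaced = Map.replace(data, "id", "xyz")
not_replaced = Map.replace(data, "name", "xyz")

# update/4: applies the function if the key exists; otherwise it
# stores the default as-is (the function is not applied to it).
updated = Map.update(data, "id", "default", &String.downcase/1)
defaulted = Map.update(data, "name", "default", &String.downcase/1)

# put/3: sets the key whether or not it exists.
put = Map.put(data, "id", "new")

# put_new/3: sets the key only if it is absent.
kept = Map.put_new(data, "id", "new")
added = Map.put_new(data, "name", "new")

IO.inspect({replaced, not_replaced})
IO.inspect({updated, defaulted})
IO.inspect({put, kept, added})
```

Note that `not_replaced` and `kept` are equal to `data`: those two calls are no-ops when their precondition on the key is not met.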
Changing the key name in a map
What if you need to change the key's name? For example, for the following map, to change the key name from _id to id:
# from:
%{"_id" => "example"}
# to:
%{"id" => "example"}
Map.pop/3 is a good fit here:
fn data ->
  # first we remove the old key ("_id"), and save its value
  {id_value, temp_data} = Map.pop(data, "_id")
  # optionally, we may compute a new value based on the
  # original value:
  #
  # id_value = process_id_value(id_value)
  # then we put back the id value under a new key ("id")
  Map.put(temp_data, "id", id_value)
end
Updating a nested map
What if the target is not at the top level of the map? For example, to downcase the id value in the following data:
%{
  a: %{
    b: %{
      c: %{
        "id" => "EXAMPLE" # <- change it to "example"
      }
    }
  }
}
One solution can be:
fn data ->
  Map.update!(data, :a, fn data_a ->
    Map.update!(data_a, :b, fn data_b ->
      Map.update!(data_b, :c, fn data_c ->
        Map.update!(data_c, "id", &String.downcase/1)
      end)
    end)
  end)
end
Looks too complex? We can employ Kernel.update_in/2 to simplify it:
fn data ->
  update_in(data.a.b.c["id"], &String.downcase/1)
end
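When the path is not known at compile time, the function form `update_in/3` takes the path as an ordinary list of keys instead. Here is a short sketch using the same data, together with the related `get_in/2` and `put_in/3`:

```elixir
data = %{a: %{b: %{c: %{"id" => "EXAMPLE"}}}}

# The path is a plain list, so it can be built at runtime.
path = [:a, :b, :c, "id"]

# update_in/3 computes the new value from the old one.
downcased = update_in(data, path, &String.downcase/1)
IO.inspect(get_in(downcased, path)) # "example"

# put_in/3 sets a value without looking at the old one.
replaced = put_in(data, path, "other")
IO.inspect(get_in(replaced, path)) # "other"

# The original data is untouched.
IO.inspect(get_in(data, path)) # "EXAMPLE"
```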
Updating a list
Let's say we want to convert the second item in the list, upcasing its value:
# before
["a", "b", "c", "d"]
# after
["a", "B", "c", "d"]
We can use List.update_at/3 to update any data at a given index in a list:
fn data ->
  List.update_at(data, 1, &String.upcase/1)
end
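The before and after lists share structure: the new list gets fresh cells for every item up to and including the updated one, and reuses the original tail after it.
This is expensive when the list is big, because updating the item at index n means rebuilding the n cells in front of it to generate the new list.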
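A few details of `List.update_at/3` are worth seeing in action. The sketch below also shows one possible alternative, assuming many positions need to change at once: a single pass with `Enum.with_index/1` and `Enum.map/2` instead of repeated `update_at` calls. The "every odd index" rule is only an illustrative assumption.

```elixir
list = ["a", "b", "c", "d"]

# Update by index; negative indexes count from the end.
second = List.update_at(list, 1, &String.upcase/1)
last = List.update_at(list, -1, &String.upcase/1)

# An out-of-range index leaves the list unchanged.
same = List.update_at(list, 10, &String.upcase/1)

# When many positions change, one pass over the list avoids
# rebuilding the front of the list once per update.
odd_upcased =
  list
  |> Enum.with_index()
  |> Enum.map(fn {item, index} ->
    if rem(index, 2) == 1, do: String.upcase(item), else: item
  end)

IO.inspect({second, last, same, odd_upcased})
```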
Updating all fields recursively
In some cases, we don't know exactly where the target is. For example, let's say we want to rewrite all the image URLs in the following data, replacing "example.com" with "image.example.com":
# original data (- lines) and expected result (+ lines)
%{
  "data" => %{
-   "logo" => "http://example.com/logo.png",
+   "logo" => "http://image.example.com/logo.png",
    "articles" => [
      %{
        "title" => "Article 1",
        "image" => %{
-         "url" => "http://example.com/1.png"
+         "url" => "http://image.example.com/1.png"
        }
      },
      %{
        "title" => "Article 2",
        "image" => %{
-         "url" => "http://example.com/2.png"
+         "url" => "http://image.example.com/2.png"
        }
      }
    ]
  }
}
We can use Enum.map/2 (or Map.new/2 for maps) to update recursively.
For a map, we iterate all its {key, value} pairs:
def process(map) when is_map(map),
  do:
    map
    |> Enum.map(fn {key, value} ->
      {key, process(value)}
    end)
    |> Enum.into(%{})
or
def process(map) when is_map(map) do
  for {key, value} <- map, into: %{} do
    {key, process(value)}
  end
end
or
def process(map) when is_map(map),
  do:
    map
    |> Map.new(fn {key, value} ->
      {key, process(value)}
    end)
For a list, we iterate all items inside it:
def process(list) when is_list(list),
  do: Enum.map(list, &process/1)
For a binary (a string), we apply the replacement to it:
def process(url) when is_binary(url),
  do:
    String.replace(
      url,
      "http://example.com/",
      "http://image.example.com/"
    )
Otherwise, we just return the original data:
def process(other), do: other
So the final function would be:
def process(map) when is_map(map) do
  for {key, value} <- map, into: %{} do
    {key, process(value)}
  end
end
def process(list) when is_list(list),
  do: Enum.map(list, &process/1)
def process(url) when is_binary(url),
  do:
    String.replace(
      url,
      "http://example.com/",
      "http://image.example.com/"
    )
def process(other), do: other
This, of course, is not complete for all projects, and you may need to iterate over other data structures too. But the point is that you can use this pattern to transform your data’s shape recursively.
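One case that often needs handling in practice is structs: they pass the `is_map/1` guard, but most of them (for example `Date`) do not implement `Enumerable`, so iterating over them raises. The sketch below adds a clause that returns structs untouched. The module name `UrlRewriter` and the sample input are hypothetical, and passing structs through unchanged is just one possible policy.

```elixir
defmodule UrlRewriter do
  # Structs are maps too, so this clause must come before the
  # `is_map/1` clause. Here structs are returned as they are.
  def process(%_{} = struct), do: struct

  def process(map) when is_map(map),
    do: Map.new(map, fn {key, value} -> {key, process(value)} end)

  def process(list) when is_list(list),
    do: Enum.map(list, &process/1)

  def process(url) when is_binary(url),
    do: String.replace(url, "http://example.com/", "http://image.example.com/")

  def process(other), do: other
end

input = %{
  "published" => ~D[2021-08-11],
  "images" => ["http://example.com/1.png", "http://other.org/2.png"]
}

output = UrlRewriter.process(input)
IO.inspect(output)
```

Only the first URL is rewritten; the date struct and the URL on another host come back unchanged.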
The refactored code
Let's look back at the code mentioned at the beginning.
The code smells bad because it relies only on Map.delete/2 and Map.put/3, which are not ideal here. It can be refactored as:
def transform_store(store) do
  store
  # Let's separate the update on retailer into another small function.
  |> Map.update!("retailers", &transform_retailer/1)
  # Since replacing key name is so common, it is now a new function.
  |> replace_key_in(~w[id], "_id")
  # We reuse the function to change map key.
  |> replace_key_in(~w[retailers], "retailer")
end
defp transform_retailer(retailer) do
  retailer
  # Here we use `Map.update/4` to update the logo url.
  # The result is slightly different from the original version:
  # a missing "logoUrl" key becomes "logoUrl" => nil, and an
  # explicit nil value would raise. That should be OK here.
  |> Map.update("logoUrl", nil, &(@prefix <> &1))
  # Again, we use `replace_key_in/3` to change key name.
  |> replace_key_in(~w[id], "_id")
  |> replace_key_in(~w[company id], "_id")
end
And here's the definition of replace_key_in/3:
defp replace_key_in(data, [old_name], new_name),
  do: replace_key(data, old_name, new_name)
defp replace_key_in(data, [_ | _] = path, new_name) do
  {path, [old_name]} = Enum.split(path, -1)
  update_in(
    data,
    path,
    &replace_key(&1, old_name, new_name)
  )
end
defp replace_key(data, old_name, new_name) do
  {value, rest} = Map.pop(data, old_name)
  Map.put(rest, new_name, value)
end
Let's quickly recap what we have done to improve it.
- Since replacing a key in a map is so frequent here, we wrote a helper function replace_key_in/3 to keep the domain function clear to read. Furthermore, this helper can be extracted to another place and reused by other modules or projects.
- We use Kernel.update_in/3 to update the deeply nested map.
- Several Map APIs (Map.update!/3, Map.pop/2, and Map.put/3) are employed to serve different purposes.
I believe the refactored version is more readable, maintainable, and reusable.
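To illustrate the reuse point, here is a sketch of the helper extracted into its own module with public functions. The module name `MapKeys` and the trimmed-down sample store are assumptions made for this example; the function bodies follow the definitions above.

```elixir
defmodule MapKeys do
  # Renames a top-level key, keeping its value.
  def replace_key(map, old_name, new_name) do
    {value, rest} = Map.pop(map, old_name)
    Map.put(rest, new_name, value)
  end

  # Renames the key at the end of `path`; the preceding
  # elements locate the nested map that holds it.
  def replace_key_in(data, [old_name], new_name),
    do: replace_key(data, old_name, new_name)

  def replace_key_in(data, [_ | _] = path, new_name) do
    {parents, [old_name]} = Enum.split(path, -1)
    update_in(data, parents, &replace_key(&1, old_name, new_name))
  end
end

store = %{
  "id" => "store-id",
  "retailers" => %{
    "id" => "retailer-id",
    "company" => %{"id" => "company-id"}
  }
}

result =
  store
  |> MapKeys.replace_key_in(~w[retailers company id], "_id")
  |> MapKeys.replace_key_in(~w[retailers id], "_id")
  |> MapKeys.replace_key_in(~w[id], "_id")
  |> MapKeys.replace_key_in(~w[retailers], "retailer")

IO.inspect(result)
```

The order of the calls matters: the nested renames run while the parent key is still called "retailers", and the parent is renamed last.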
Conclusion
We have discussed several patterns to update some parts of the data. I hope they are helpful to you!