<?xml version="1.0" encoding="utf-8" standalone="yes" ?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
  <channel>
    <title>Functional Programming on rdata.lu Blog | Data science with R</title>
    <link>/tags/functional-programming/</link>
    <description>Recent content in Functional Programming on rdata.lu Blog | Data science with R</description>
    <generator>Hugo -- gohugo.io</generator>
    <language>en-us</language>
    <copyright>Copyright (c) rdata.lu. All rights reserved. &lt;br&gt; Content reblogged by &lt;a href=&#39;https://www.r-bloggers.com/&#39; target=&#39;_blank&#39;&gt;R-bloggers&lt;/a&gt; &amp; &lt;a href=&#39;http://www.rweekly.org/&#39; target=&#39;_blank&#39;&gt;RWeekly&lt;/a&gt;</copyright>
    <lastBuildDate>Tue, 14 Nov 2017 00:00:00 +0000</lastBuildDate>
    
        <atom:link href="/tags/functional-programming/index.xml" rel="self" type="application/rss+xml" />
    
    
    <item>
      <title>Functional peace of mind</title>
      <link>/post/2017-11-14-functional-peace-of-mind/</link>
      <pubDate>Tue, 14 Nov 2017 00:00:00 +0000</pubDate>
      
      <guid>/post/2017-11-14-functional-peace-of-mind/</guid>
      <description>&lt;p&gt;I think what I enjoy the most about functional programming is the peace of mind that comes with it. With functional programming, there’s a lot of stuff you don’t need to think about. You can write functions that are general enough so that they solve a variety of problems. For example, imagine for a second that R does not have the &lt;code&gt;sum()&lt;/code&gt; function anymore. If you want to compute the sum of, say, the first 100 integers, you could write a loop that would do that for you:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;numbers = 0

for (i in 1:100){
  numbers = numbers + i
}

print(numbers)&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## [1] 5050&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The problem with this approach, is that you cannot reuse any of the code there, even if you put it inside a function. For instance, what if you want to merge 4 datasets together? You would need something like this:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;library(dplyr)
data(mtcars)&lt;/code&gt;&lt;/pre&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;mtcars1 = mtcars %&amp;gt;%
  mutate(id = &amp;quot;1&amp;quot;)

mtcars2 = mtcars %&amp;gt;%
  mutate(id = &amp;quot;2&amp;quot;)

mtcars3 = mtcars %&amp;gt;%
  mutate(id = &amp;quot;3&amp;quot;)

mtcars4 = mtcars %&amp;gt;%
  mutate(id = &amp;quot;4&amp;quot;)

datasets = list(mtcars1, mtcars2, mtcars3, mtcars4)

temp = datasets[[1]]

for(i in 1:3){
  temp = full_join(temp, datasets[[i+1]])
}&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## Joining, by = c(&amp;quot;mpg&amp;quot;, &amp;quot;cyl&amp;quot;, &amp;quot;disp&amp;quot;, &amp;quot;hp&amp;quot;, &amp;quot;drat&amp;quot;, &amp;quot;wt&amp;quot;, &amp;quot;qsec&amp;quot;, &amp;quot;vs&amp;quot;, &amp;quot;am&amp;quot;, &amp;quot;gear&amp;quot;, &amp;quot;carb&amp;quot;, &amp;quot;id&amp;quot;)
## Joining, by = c(&amp;quot;mpg&amp;quot;, &amp;quot;cyl&amp;quot;, &amp;quot;disp&amp;quot;, &amp;quot;hp&amp;quot;, &amp;quot;drat&amp;quot;, &amp;quot;wt&amp;quot;, &amp;quot;qsec&amp;quot;, &amp;quot;vs&amp;quot;, &amp;quot;am&amp;quot;, &amp;quot;gear&amp;quot;, &amp;quot;carb&amp;quot;, &amp;quot;id&amp;quot;)
## Joining, by = c(&amp;quot;mpg&amp;quot;, &amp;quot;cyl&amp;quot;, &amp;quot;disp&amp;quot;, &amp;quot;hp&amp;quot;, &amp;quot;drat&amp;quot;, &amp;quot;wt&amp;quot;, &amp;quot;qsec&amp;quot;, &amp;quot;vs&amp;quot;, &amp;quot;am&amp;quot;, &amp;quot;gear&amp;quot;, &amp;quot;carb&amp;quot;, &amp;quot;id&amp;quot;)&lt;/code&gt;&lt;/pre&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;glimpse(temp)&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## Observations: 128
## Variables: 12
## $ mpg  &amp;lt;dbl&amp;gt; 21.0, 21.0, 22.8, 21.4, 18.7, 18.1, 14.3, 24.4, 22.8, 19....
## $ cyl  &amp;lt;dbl&amp;gt; 6, 6, 4, 6, 8, 6, 8, 4, 4, 6, 6, 8, 8, 8, 8, 8, 8, 4, 4, ...
## $ disp &amp;lt;dbl&amp;gt; 160.0, 160.0, 108.0, 258.0, 360.0, 225.0, 360.0, 146.7, 1...
## $ hp   &amp;lt;dbl&amp;gt; 110, 110, 93, 110, 175, 105, 245, 62, 95, 123, 123, 180, ...
## $ drat &amp;lt;dbl&amp;gt; 3.90, 3.90, 3.85, 3.08, 3.15, 2.76, 3.21, 3.69, 3.92, 3.9...
## $ wt   &amp;lt;dbl&amp;gt; 2.620, 2.875, 2.320, 3.215, 3.440, 3.460, 3.570, 3.190, 3...
## $ qsec &amp;lt;dbl&amp;gt; 16.46, 17.02, 18.61, 19.44, 17.02, 20.22, 15.84, 20.00, 2...
## $ vs   &amp;lt;dbl&amp;gt; 0, 0, 1, 1, 0, 1, 0, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 1, 1, ...
## $ am   &amp;lt;dbl&amp;gt; 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, ...
## $ gear &amp;lt;dbl&amp;gt; 4, 4, 4, 3, 3, 3, 3, 4, 4, 4, 4, 3, 3, 3, 3, 3, 3, 4, 4, ...
## $ carb &amp;lt;dbl&amp;gt; 4, 4, 1, 1, 2, 1, 4, 2, 2, 4, 4, 3, 3, 3, 4, 4, 4, 1, 2, ...
## $ id   &amp;lt;chr&amp;gt; &amp;quot;1&amp;quot;, &amp;quot;1&amp;quot;, &amp;quot;1&amp;quot;, &amp;quot;1&amp;quot;, &amp;quot;1&amp;quot;, &amp;quot;1&amp;quot;, &amp;quot;1&amp;quot;, &amp;quot;1&amp;quot;, &amp;quot;1&amp;quot;, &amp;quot;1&amp;quot;, &amp;quot;1&amp;quot;, &amp;quot;1...&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Of course, the logic is very similar as before, but you need to think carefully about the structure holding your elements (which can be numbers, datasets, characters, etc…) as well as be careful about indexing correctly… and depending on the type of objects you are working on, you might need to tweak the code further.&lt;/p&gt;
&lt;p&gt;How would a functional programming approach make this easier? Of course, you could use &lt;code&gt;purrr::reduce()&lt;/code&gt; to solve these problems. However, since I assumed that &lt;code&gt;sum()&lt;/code&gt; does not exist, I will also assume that &lt;code&gt;purrr::reduce()&lt;/code&gt; does not exist either and write my own, clumsy implementation. Here’s the code:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;my_reduce = function(a_list, a_func, init = NULL, ...){

  if(is.null(init)){
    init = `[[`(a_list, 1)
    a_list = tail(a_list, -1)
  }

  car = `[[`(a_list, 1)
  cdr = tail(a_list, -1)
  init = a_func(init, car, ...)

  if(length(cdr) != 0){
    my_reduce(cdr, a_func, init, ...)
  }
  else {
    init
  }
}&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This can look much more complicated than before, but the idea is quite simple; &lt;em&gt;if you know about recursive functions&lt;/em&gt; (recursive functions are functions that call themselves). I won’t explain how the function works, because it is not the main point of the article (but if you’re curious, I encourage you to play around with it). The point is that now, I can do the following:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;my_reduce(list(1,2,3,4,5), `+`)&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## [1] 15&lt;/code&gt;&lt;/pre&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;my_reduce(datasets, full_join) %&amp;gt;% glimpse&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## Joining, by = c(&amp;quot;mpg&amp;quot;, &amp;quot;cyl&amp;quot;, &amp;quot;disp&amp;quot;, &amp;quot;hp&amp;quot;, &amp;quot;drat&amp;quot;, &amp;quot;wt&amp;quot;, &amp;quot;qsec&amp;quot;, &amp;quot;vs&amp;quot;, &amp;quot;am&amp;quot;, &amp;quot;gear&amp;quot;, &amp;quot;carb&amp;quot;, &amp;quot;id&amp;quot;)
## Joining, by = c(&amp;quot;mpg&amp;quot;, &amp;quot;cyl&amp;quot;, &amp;quot;disp&amp;quot;, &amp;quot;hp&amp;quot;, &amp;quot;drat&amp;quot;, &amp;quot;wt&amp;quot;, &amp;quot;qsec&amp;quot;, &amp;quot;vs&amp;quot;, &amp;quot;am&amp;quot;, &amp;quot;gear&amp;quot;, &amp;quot;carb&amp;quot;, &amp;quot;id&amp;quot;)
## Joining, by = c(&amp;quot;mpg&amp;quot;, &amp;quot;cyl&amp;quot;, &amp;quot;disp&amp;quot;, &amp;quot;hp&amp;quot;, &amp;quot;drat&amp;quot;, &amp;quot;wt&amp;quot;, &amp;quot;qsec&amp;quot;, &amp;quot;vs&amp;quot;, &amp;quot;am&amp;quot;, &amp;quot;gear&amp;quot;, &amp;quot;carb&amp;quot;, &amp;quot;id&amp;quot;)&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## Observations: 128
## Variables: 12
## $ mpg  &amp;lt;dbl&amp;gt; 21.0, 21.0, 22.8, 21.4, 18.7, 18.1, 14.3, 24.4, 22.8, 19....
## $ cyl  &amp;lt;dbl&amp;gt; 6, 6, 4, 6, 8, 6, 8, 4, 4, 6, 6, 8, 8, 8, 8, 8, 8, 4, 4, ...
## $ disp &amp;lt;dbl&amp;gt; 160.0, 160.0, 108.0, 258.0, 360.0, 225.0, 360.0, 146.7, 1...
## $ hp   &amp;lt;dbl&amp;gt; 110, 110, 93, 110, 175, 105, 245, 62, 95, 123, 123, 180, ...
## $ drat &amp;lt;dbl&amp;gt; 3.90, 3.90, 3.85, 3.08, 3.15, 2.76, 3.21, 3.69, 3.92, 3.9...
## $ wt   &amp;lt;dbl&amp;gt; 2.620, 2.875, 2.320, 3.215, 3.440, 3.460, 3.570, 3.190, 3...
## $ qsec &amp;lt;dbl&amp;gt; 16.46, 17.02, 18.61, 19.44, 17.02, 20.22, 15.84, 20.00, 2...
## $ vs   &amp;lt;dbl&amp;gt; 0, 0, 1, 1, 0, 1, 0, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 1, 1, ...
## $ am   &amp;lt;dbl&amp;gt; 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, ...
## $ gear &amp;lt;dbl&amp;gt; 4, 4, 4, 3, 3, 3, 3, 4, 4, 4, 4, 3, 3, 3, 3, 3, 3, 4, 4, ...
## $ carb &amp;lt;dbl&amp;gt; 4, 4, 1, 1, 2, 1, 4, 2, 2, 4, 4, 3, 3, 3, 4, 4, 4, 1, 2, ...
## $ id   &amp;lt;chr&amp;gt; &amp;quot;1&amp;quot;, &amp;quot;1&amp;quot;, &amp;quot;1&amp;quot;, &amp;quot;1&amp;quot;, &amp;quot;1&amp;quot;, &amp;quot;1&amp;quot;, &amp;quot;1&amp;quot;, &amp;quot;1&amp;quot;, &amp;quot;1&amp;quot;, &amp;quot;1&amp;quot;, &amp;quot;1&amp;quot;, &amp;quot;1...&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;And if I need to merge another dataset, I don’t need to change anything at all. Plus, because &lt;code&gt;my_reduce()&lt;/code&gt; is very general, I can even use it for situation I didn’t write it for in the first place:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;my_reduce(list(&amp;quot;a&amp;quot;, &amp;quot;b&amp;quot;, &amp;quot;c&amp;quot;, &amp;quot;d&amp;quot;, &amp;quot;e&amp;quot;), paste)&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## [1] &amp;quot;a b c d e&amp;quot;&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Of course, &lt;code&gt;paste()&lt;/code&gt; is vectorized, so you could just as well do &lt;code&gt;paste(1, 2, 3, 4, 5)&lt;/code&gt;, but again, I want to insist on the fact that writing or using such functions allows you to abstract over a lot of thing. There is nothing specific to any type of object in &lt;code&gt;my_reduce()&lt;/code&gt;, whereas the loops have to be tailored for the kind of object you’re working with. As long as the &lt;code&gt;a_func&lt;/code&gt; argument is a binary operator that combines the elements inside &lt;code&gt;a_list&lt;/code&gt;, it’s going to work. And I don’t need to think about indexing, about having temporary variables or thinking about the structure that will hold my results.&lt;/p&gt;
&lt;p&gt;
Don’t hesitate to follow us on twitter &lt;a href=&#34;https://twitter.com/rdata_lu&#34; target=&#34;_blank&#34;&gt;&lt;span class=&#34;citation&#34;&gt;@rdata_lu&lt;/span&gt;&lt;/a&gt; &lt;!-- or &lt;a href=&#34;https://twitter.com/brodriguesco&#34;&gt;@brodriguesco&lt;/a&gt; --&gt; and to &lt;a href=&#34;https://www.youtube.com/channel/UCbazvBnJd7CJ4WnTL6BI6qw?sub_confirmation=1&#34; target=&#34;_blank&#34;&gt;subscribe&lt;/a&gt; to our youtube channel. &lt;br&gt; You can also contact us if you have any comments or suggestions. See you for the next post!
&lt;/p&gt;
</description>
    </item>
    
  </channel>
</rss>
