Microsoft R is Microsoft's enhanced implementation of R, an open source development language for statistical analytics commonly used by data scientists, statisticians and academia. Microsoft's family of R products includes Microsoft R Open, Microsoft R Client and Microsoft R Server.
Microsoft R Open, formerly known as Revolution R Open, is an enhanced version of R that includes a high-performance R language engine. It comes with the Reproducible R Toolkit, which ensures that results of R code executions are repeatable over time, and that others who run the same code will achieve precisely the same results. Microsoft R Open is free to download, use and share.
Microsoft R Open operates in host execution environments, which include a variety of Hadoop distributions, such as those from Cloudera, Hortonworks and MapR. Microsoft R Open can also operate in enterprise data warehouse platforms, such as those from Teradata or IBM, and on compute grids, such as those from Microsoft and IBM.
Microsoft R Client and R Server are both built on Microsoft R Open. They introduce Microsoft's proprietary ScaleR technology, a comprehensive library of big data analytics algorithms that support parallelization of computations and data analysis.
Microsoft R Client is a free, high-performance analytics tool that enables users to perform data analytics using parallel processing via the ScaleR technology. Microsoft R Client has two main limitations: The data being processed must fit in the memory of the local client, and it is limited to two threads when running ScaleR functions.
Microsoft R Server is built on Microsoft R Open and offers an enhanced version of R that provides enterprise-grade performance and scalability. It can run R scripts and Comprehensive R Archive Network, or CRAN, packages via clustered parallel processing. Microsoft R Server removes the limitations of Microsoft R Client by using disk scalability, which enables users to perform analytics on data sets larger than the server's available memory.
ScaleR algorithms available with Microsoft R Server reduce memory limitations, as they're implemented as optimized parallel external memory algorithms, which manage available RAM and storage together, resulting in increased scalability for analytics processing. With the tools of ScaleR, developers don't need special development methods or languages to enable parallel processing.
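The external memory idea described above can be illustrated with a minimal sketch. It is written in Python for illustration only and is not ScaleR code or ScaleR's actual implementation: the function names (`read_chunks`, `chunk_stats`, `combine`, `chunked_mean_variance`) are hypothetical, and the in-memory iterable stands in for blocks read from disk. The sketch assumes the general pattern of computing small intermediate results per chunk and then combining them, so that only one chunk needs to be in memory at a time.

```python
from typing import Iterable, Iterator, List, Tuple

Stats = Tuple[int, float, float]  # (count, sum, sum of squares)


def read_chunks(values: Iterable[float], chunk_size: int) -> Iterator[List[float]]:
    """Yield successive chunks; a stand-in for reading blocks from disk."""
    chunk: List[float] = []
    for value in values:
        chunk.append(value)
        if len(chunk) == chunk_size:
            yield chunk
            chunk = []
    if chunk:  # final, possibly shorter chunk
        yield chunk


def chunk_stats(chunk: List[float]) -> Stats:
    """Compute intermediate results for one chunk."""
    return len(chunk), float(sum(chunk)), float(sum(x * x for x in chunk))


def combine(a: Stats, b: Stats) -> Stats:
    """Merge intermediate results; the order of merging does not matter,
    which is what would allow chunks to be processed in parallel."""
    return a[0] + b[0], a[1] + b[1], a[2] + b[2]


def chunked_mean_variance(values: Iterable[float], chunk_size: int) -> Tuple[float, float]:
    """Return (mean, sample variance) while holding one chunk at a time."""
    total: Stats = (0, 0.0, 0.0)
    for chunk in read_chunks(values, chunk_size):
        total = combine(total, chunk_stats(chunk))
    n, s, ss = total
    if n < 2:
        raise ValueError("need at least two values")
    mean = s / n
    variance = (ss - n * mean * mean) / (n - 1)
    return mean, variance


if __name__ == "__main__":
    mean, variance = chunked_mean_variance(range(1, 11), chunk_size=3)
    print(mean, variance)
```

Because each chunk's intermediate results are independent, the per-chunk step could in principle be spread across threads or cluster nodes and merged afterward. The sum-of-squares formula used here is chosen for brevity and can lose precision on large or poorly scaled data.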
Microsoft R Client and R Server support data preparation, descriptive statistical functions, data visualization, statistical test functions, classification and machine learning capabilities, as well as parallelized statistical modeling algorithms. In addition, ScaleR with R Server offers options for open database connectivity drivers and other connector capabilities, enabling integration with several database systems and with the Hadoop Distributed File System.
Microsoft also offers SQL Server R Services, which is an R installation that runs alongside SQL Server, providing users with a mechanism for integrating Microsoft R Server data with SQL Server and Microsoft's other business intelligence tools.
The most current version of R Open can be used as open source under the GNU General Public License version 2. Microsoft R Open runs on Windows, macOS 10.9 and above, and several versions of Linux, with community forum support available. The latest version of the Microsoft R Client can be downloaded for use on Windows.
Microsoft R Server is available as a commercial product, and it can be used on Windows and Linux, as well as with Hadoop, Teradata and SQL Server. Microsoft R Server can be downloaded from several channels, including with a Microsoft Developer Network subscription, the Volume Licensing Service Center and via Visual Studio Dev Essentials.
Developers and data scientists can install the free developer edition, which delivers the same features as the enterprise version but is intended for development purposes. Contact Microsoft for pricing and support options.