Salesforce Open Sources an Engine to Automate ML Model Building

Customer relationship management service provider Salesforce has released as open source the automated model-building engine that the company uses for its Einstein AI-driven platform.

The TransmogrifAI library can be used to build highly automated machine learning workflows that run on Apache Spark, using the relational data that most organizations keep on hand. TransmogrifAI (pronounced trans-mog-ri-phi) addresses one of the major challenges of running machine learning (ML) in production: establishing a workflow for quickly developing and testing models that predict future outcomes.

“It automates the entire machine learning workflow. You can build a good machine learning model on a given dataset in a couple hours, instead of weeks or months,” said Shubha Nabar, Salesforce’s senior director of data science for Einstein. “It significantly reduces the time and expertise to use machine learning.”

Salesforce first developed the library for its own Einstein AI-as-a-service, to help customers predict customer churn, sales forecasts, lead conversions, equipment failures, and late payments. Because each customer’s needs and data were different (not to mention private), the company had to build and deploy thousands of machine learning models on a case-by-case basis.

“Every customer’s data is so different — different schemas, different shapes, different biases that are introduced by their processes,” Nabar said. “For a machine learning model to do a good job, it has to be built on the customer’s data. But at Salesforce-scale, if you have to build a model for every single use case, it just doesn’t scale.”

So the company built automation tools to do as much of the grunt work as possible. Today, models built with the library power over three billion predictions a day.

Given structured data from a relational database, TransmogrifAI can work through the steps of producing a model that can be used to make predictions about future behavior. These include data preparation (“feature inference”), converting data into a numerical representation (the “transmogrification” or “feature engineering”), and removing any data with no predictive power (“feature validation”).
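
The three steps above can be illustrated with a small sketch. This is plain Python, not TransmogrifAI's Scala API: the toy rows, the column names, and the helper functions `infer_type`, `vectorize`, and `validate` are all invented for illustration, and the "no predictive power" check is reduced to dropping columns that never vary.

```python
from statistics import pvariance

# Hypothetical toy rows; the column names are invented for illustration.
rows = [
    {"age": "34", "plan": "pro", "region": "eu"},
    {"age": "51", "plan": "free", "region": "eu"},
    {"age": "29", "plan": "pro", "region": "eu"},
    {"age": "45", "plan": "free", "region": "eu"},
]

def infer_type(values):
    """Feature inference: guess whether a raw column is numeric or categorical."""
    try:
        for v in values:
            float(v)
        return "numeric"
    except ValueError:
        return "categorical"

def vectorize(rows, columns):
    """Feature engineering: turn every column into one or more numeric columns."""
    out = {}
    for col in columns:
        values = [r[col] for r in rows]
        if infer_type(values) == "numeric":
            out[col] = [float(v) for v in values]
        else:
            # One-hot encode each distinct category.
            for cat in sorted(set(values)):
                out[f"{col}={cat}"] = [1.0 if v == cat else 0.0 for v in values]
    return out

def validate(features):
    """Feature validation (simplified): drop columns that never vary."""
    return {name: col for name, col in features.items() if pvariance(col) > 0}

features = validate(vectorize(rows, ["age", "plan", "region"]))
print(sorted(features))
```

Here the `region` column has the same value in every row, so its one-hot column is constant and gets dropped, while `age` and the two `plan` indicator columns survive. A real validation step would look at the relationship with the label rather than variance alone.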

Finally, the software trains several different machine learning algorithms (or “models”) on the data and picks the best one, offering a summary of each algorithm’s performance. The software also offers hyperparameter optimization, the ability to tune each algorithm’s configuration settings for best performance. (A blog post from Nabar explains each of these steps in greater detail.)
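
As a rough illustration of that selection step, the sketch below tries two simple algorithms, one of them over a small hyperparameter grid, and keeps whichever scores best on held-out rows. It is plain Python rather than TransmogrifAI code; the toy data, the two classifiers, and the grid of `k` values are assumptions made up for the example.

```python
from statistics import mean

# Hypothetical one-feature data set: (value, label) pairs invented for illustration.
train = [(1.0, 0), (2.0, 0), (3.0, 0), (6.2, 1), (7.1, 1), (8.0, 1), (2.5, 1)]
valid = [(2.4, 0), (4.0, 0), (5.6, 1), (9.0, 1)]

def fit_knn(data, k):
    """k-nearest neighbours; k is a hyperparameter fixed before training."""
    def predict(x):
        nearest = sorted(data, key=lambda pair: abs(pair[0] - x))[:k]
        votes = sum(label for _, label in nearest)
        return 1 if votes * 2 > k else 0
    return predict

def fit_centroid(data):
    """Nearest-centroid classifier; it has no hyperparameters to tune."""
    c0 = mean(x for x, y in data if y == 0)
    c1 = mean(x for x, y in data if y == 1)
    return lambda x: 1 if abs(x - c1) < abs(x - c0) else 0

def accuracy(model, data):
    return sum(model(x) == y for x, y in data) / len(data)

# Every (algorithm, hyperparameters) combination is one candidate.
candidates = [("centroid", {}, fit_centroid(train))]
for k in (1, 3, 7):
    candidates.append(("knn", {"k": k}, fit_knn(train, k)))

# Score each candidate on held-out data and keep a summary of all of them.
summary = [(name, params, accuracy(model, valid)) for name, params, model in candidates]
best = max(summary, key=lambda entry: entry[2])

for name, params, score in summary:
    print(f"{name:10} {params} accuracy={score:.2f}")
print("selected:", best[0], best[1])
```

With this toy data the three-neighbour model is the only candidate that classifies all four validation rows correctly, so it is the one selected. Production tools typically use cross-validation and much larger search spaces, but the shape of the loop is the same.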

Besides tying all of these tasks together in one package, TransmogrifAI could also save considerable time and open up ML work to organizations that lack in-house model-building expertise, Nabar explained. There currently appear to be many more data scientist job openings than people to fill them. Data scientist may well be the hottest job in today’s market: IBM has predicted that we’ll see 2,720,000 job listings for data scientists by 2020, up from 364,000 today.

Written in Scala, TransmogrifAI builds on top of Spark ML Pipelines, using the Transformer and Estimator abstractions for transforming DataFrames, as well as its own abstraction over DataFrame columns, called Features. A Feature is a type-safe pointer to a column in a DataFrame that carries all the information about that column, allowing developers to define, work with, and share features in much the same way they’d work with standard variables.
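
The idea of a typed handle to a column can be sketched in a few lines. The `Feature` class below is a hypothetical Python stand-in, not the library's Scala type: it stores only a column name and an expected type, and it checks values at read time, whereas a Scala implementation can catch mismatches at compile time.

```python
from dataclasses import dataclass
from typing import Generic, List, Type, TypeVar

T = TypeVar("T")

@dataclass(frozen=True)
class Feature(Generic[T]):
    """A named, typed handle to one column; it holds no data itself."""
    name: str
    dtype: Type[T]
    is_response: bool = False

    def read(self, table: dict) -> List[T]:
        """Resolve the handle against a column-oriented table, checking types."""
        column = table[self.name]
        for value in column:
            if not isinstance(value, self.dtype):
                raise TypeError(
                    f"{self.name}: expected {self.dtype.__name__}, got {value!r}"
                )
        return column

# Hypothetical column-oriented table standing in for a DataFrame.
table = {"age": [34.0, 51.0], "churned": [True, False]}

age = Feature("age", float)
churned = Feature("churned", bool, is_response=True)

print(age.read(table))
print(churned.read(table))
```

Because a handle is just a small immutable value, it can be passed around, stored, and shared like any other variable, and a handle declared with the wrong type (for example `Feature("age", str)`) fails as soon as it is read.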

The software is available under a BSD 3-Clause license, and the company welcomes outside contributions to further refine the library.

Feature image: Joshua Sortino on Unsplash.

The post Salesforce Open Sources an Engine to Automate ML Model Building appeared first on The New Stack.
