Data Science Platforms Help the Buy Side Integrate Alternative Data

October 4, 2018 | By: Ivy Schmerken

By Ivy Schmerken, Editorial Director

Asset managers are looking for ways to mine alternative data sets for investment ideas, recognizing that stock pickers cannot rely on traditional research.

An explosion in alternative data, ranging from satellite images to mobile geolocation data and unstructured text, has created an arms race among hedge funds and quantitative funds. Many firms have hired data scientists to dive into pools of data and to use machine learning algorithms to extract insights or predictive signals. Traditional buy-side firms are also eyeing big data sets while wrestling with how to incorporate them into the investment process.

“Alt data is starting to become a little more mainstream in that asset managers realize they need to find new sources of alpha to gain an edge,” said Bill Stephenson, managing director of AIR Summit, a conference that was held Sept. 5-6 in New York City.

The two-day event, whose name stands for Alpha Innovation Required, focused on facilitating discussions of challenging and innovative trends affecting the institutional investment management industry, especially active managers. AIR Summit 4.0 enabled buy-side firms to connect with startups in the “alpha technology,” or alpha tech, space.

This year, 20 startups, selected from 400 candidates, presented at AIR Summit 4.0. On average, the presenting companies had been in business for four years; 14 hailed from New York, five from London and one from Vancouver.

For example, Predata, a predictive analytics company, identifies online shifts in human behavior and anticipates global events through machine learning and alternative data. “It turns metadata into insights,” said Hazem Dawani, CEO, adding that “metadata is language agnostic across the world whether it’s in English, Swahili or German.” Predata collects metadata daily from tens of thousands of web sources, and then organizes the data into categories by country and by topics related to security, social and political unrest, and economics.

When the Trump Administration imposed unexpectedly harsh sanctions on Russia in April 2018, the ruble dropped more than 8% over four days. Predata had signals predicting the possibility of new US sanctions, and these signals peaked six days ahead of the currency’s move. Customers receive early warnings on average 14 days in advance of events.

Recent signals included perceived risks that the NAFTA negotiation will fail. On Sept. 18, Predata produced a US midterm signal observer tracking online activity on issues related to the November elections.

“Geopolitical risk moves financial markets and most of the time this is not priced into the currencies or the assets,” said Dawani, whose customers include government agencies, global macro risk managers, hedge funds and prop traders.

Increasing pressure from low-cost index funds has been cited as a factor in the buy-side’s search for an edge in alt data.

“It seems to me there is going to be this massive disruption in the coming years, significant consolidation, clear winners and losers in our business,” said Stephenson. “I think there’s an opportunity to reinvent our business.”

“As more assets move to passive, there’s going to be an opportunity for those who survive on the active side to outperform,” he predicted.

The buy side is expected to double its spend on alternative data from an estimated $661 million in 2016 to $1 billion in 2020, based on analysis from

Yet, hedge funds and other quant funds have proved to be more nimble than larger asset managers at incorporating these feeds, said Stephenson.

“While all firms are looking at it, they haven’t found an easy way to back test it or integrate it into their process, which has been a challenge,” said Stephenson.

On top of that, hedge funds seem willing to pay for alternative data. Good alt data is not cheap, he said. With commissions dropping, the buy side may have difficulty paying for alt data.

“What’s more, a lot of the biggest asset managers haven’t hired data scientists and they still haven’t hired proper data folks to manage all the data internally,” he said.

“There are still challenges internally for firms to manage their own data, let alone bring in outside alternative data,” continued Stephenson. “It seems like some of the larger firms have not embraced it.”

Overcoming Obstacles: Data Wrangling

Wall Street’s demand for complex data sets has fueled the rise of “data-as-a-service” platforms that focus on “data wrangling,” the grunt work of ingesting, cleaning, normalizing and reformatting alt data feeds. Rather than each building its own data infrastructure and replicating the effort across Wall Street, major banks and even hedge funds are investing in startups and grooming these firms to customize their solutions in ways that address the non-differentiating data prep work. This frees up data scientists to do the more meaningful work of extracting insights from data, Ali-Milan Nekmouche, chief data strategist at Two Sigma, told Bloomberg.

On Sept. 6, Two Sigma Investments, a quantitative hedge fund, announced that it had made a minority investment in Crux Informatics, a startup that offers a data-engineering platform-as-a-service. Crux specializes in the drudgery of cleaning up, transforming and reformatting giant data sets, which must be done before firms can engage in the data science part.

Two Sigma joined Goldman Sachs and Citi, two of Crux’s other investors, in forming a strategic partnership to help grow the platform to meet the data-hungry demands of the financial industry.

“We find that a lot of people are not quite aware of how much effort, how much time, energy and money goes into data engineering before you can start building a predictable model on top of that,” said Philip Brittan, CEO of Crux Informatics. Firms spend 80% of their time on the data engineering part, which is ingesting, structuring, storing and physically manipulating data, and only 20% on the data science part, “which is the competitive part of looking at the data, pulling value out of it, doing research, coming up with theories, building models, finding signals and generating returns,” said Brittan.
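To make the data engineering step concrete, here is a minimal Python sketch of the kind of wrangling described above: ingesting two feeds, cleaning out unusable rows and normalizing them into one shared layout. It is an illustration only, not how Crux or any other vendor works. The two feeds, their column names, the `normalize` function and the output schema are all hypothetical.

```python
import csv
import io
from datetime import datetime

# Hypothetical raw feeds from two made-up vendors. The column names,
# delimiters, date formats and values are invented for illustration.
FOOT_TRAFFIC_CSV = """ticker,date,visits
aapl,2018-09-05,1200
AAPL,2018-09-06,
msft,2018-09-05,845
"""

CARD_SPEND_CSV = """Symbol;Day;SpendUSD
AAPL;09/05/2018;15000.50
MSFT;09/05/2018;n/a
"""


def normalize(raw, delimiter, ticker_col, date_col, date_fmt, value_col, metric):
    """Ingest one feed and return cleaned rows in a shared schema."""
    rows = []
    for rec in csv.DictReader(io.StringIO(raw), delimiter=delimiter):
        try:
            value = float(rec[value_col])
        except (TypeError, ValueError):
            # Cleaning: skip rows whose value is missing or not numeric.
            continue
        rows.append({
            # Normalizing: upper-case tickers and ISO 8601 dates.
            "ticker": rec[ticker_col].strip().upper(),
            "date": datetime.strptime(rec[date_col], date_fmt).date().isoformat(),
            "metric": metric,
            "value": value,
        })
    return rows


combined = (
    normalize(FOOT_TRAFFIC_CSV, ",", "ticker", "date", "%Y-%m-%d",
              "visits", "foot_traffic")
    + normalize(CARD_SPEND_CSV, ";", "Symbol", "Day", "%m/%d/%Y",
                "SpendUSD", "card_spend")
)

for row in combined:
    print(row)
```

Even in this toy case, most of the code handles format differences and bad rows rather than analysis, which is the imbalance Brittan describes.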

Data Science as a Platform

The rise of data science platforms is also helping investment managers to work with the alternative sets and incorporate them into their process.

“Asset managers don’t usually like to change unless they have to, [but] there is more competitive pressure than ever before,” said Zac Sheffer, co-founder and CEO of Elsen Systems, who was at the conference.

Elsen provides a platform-as-a-service for financial institutions to harness big data to make better decisions and quickly solve complex problems. Sheffer said that experts are spending 70% of their time getting data in order. Adding to the problem is that data sets tend to be siloed in large financial organizations. That’s where Elsen comes in. Its platform puts all the data from different sources in a normalized database and makes it easier for firms to bring in proprietary data. To boost performance, Sheffer said, Elsen wrote its own programming language, which runs 50 times faster while also cutting the amount of code by a factor of 50.

One $700 billion asset manager replaced its entire equity research system with Elsen’s platform to run everything from data management to infrastructure and analytics. “Because the language is so much faster, portfolio managers and analysts can use the UI to get access to more data than ever before,” said Sheffer.

Clients are using the Elsen Platform for alpha generation, testing out ideas, and portfolio construction. Elsen also works with operational teams to better understand the data they are using. “When teams work in silos they end up paying for the same data multiple times, and they can get rid of that,” said Sheffer. They can also bring in alternative data sets through a custom data tab. Many clients on the systematic side will test hundreds if not thousands of alternative data sets, he said. “We had firms test 200 alt data vendors, and found that 95% of them were pretty good and they were able to get that answer in under a day,” he said.

For those on the buy or sell side who lack a background in data science or machine learning, startups are providing platforms and tools that simplify formatting data and conducting analysis.

“While firms have read about exciting data sets, when you bring that data into a firm and hand it to an analyst, they don’t know what to make of it,” said Brad Schneider, CEO of Adaptive Management, which provides an analytics environment. Adaptive helps discretionary fundamental investors take advantage of alternative data, and is working with high-profile firms ranging from hedge funds to mutual funds and sovereign wealth funds.

The firm’s web-based platform, dubbed Data Monster, allows firms to access all the data they’ve purchased in one place, then click a few buttons to do productive analysis. “We can bring in credit card data, or foot traffic or web data or even my own custom data and turn that into an estimate of how a company is doing with regard to sales, margin or a macro indicator,” said Schneider, a former fundamental analyst and portfolio manager.

One of the complexities for anyone plunging into the world of data science is familiarity with statistical analysis. For discretionary fundamental investors who may not have taken a statistics course for 20 or 30 years, “this is a new idea,” said Schneider. While Wall Street is accustomed to a point estimate, data science lends itself to a distribution of outcomes. “So, Data Monster will churn that data into ranges that they can mouse over to see the likelihood of whether the company is going to fall inside or outside of this range,” he said. This also removes the manual work of data analysis. Instead of analysts having to wrangle all of the data sets by hand, Schneider said, the models and analysis are applied automatically as data comes into the platform. Thresholds can be set to notify portfolio managers when factors such as sales or headcount grow above or fall below a certain amount.
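The shift from a point estimate to a range, and the idea of threshold alerts, can be sketched in a few lines of Python. This is not Data Monster’s method, which the article does not describe. The sample values, the function names and the 10% limit below are assumptions made for illustration.

```python
import statistics

# Hypothetical model outputs: ten simulated estimates of a company's
# quarterly sales in USD millions. The values are invented.
sales_samples = [98, 101, 103, 104, 105, 106, 107, 109, 111, 115]


def central_range(samples, lower_pct=10, upper_pct=90):
    """Return the (lower, upper) percentile bounds of the samples."""
    # quantiles(n=100) returns 99 cut points; index k-1 is the k-th percentile.
    cuts = statistics.quantiles(samples, n=100, method="inclusive")
    return cuts[lower_pct - 1], cuts[upper_pct - 1]


def share_within(samples, low, high):
    """Fraction of simulated outcomes that land inside [low, high]."""
    return sum(low <= s <= high for s in samples) / len(samples)


def threshold_alert(previous, current, pct_change_limit):
    """Flag a factor whose relative change meets or exceeds the limit."""
    change = (current - previous) / previous
    return abs(change) >= pct_change_limit, change


low, high = central_range(sales_samples)
inside = share_within(sales_samples, low, high)
print(f"Estimated range: {low:.1f} to {high:.1f} ({inside:.0%} of outcomes)")

# A single-number forecast can then be judged against the range.
point_estimate = 112  # hypothetical consensus figure
print("Point estimate inside range:", low <= point_estimate <= high)

# Notify when sales move by 10% or more versus the prior period.
alert, change = threshold_alert(previous=100, current=112, pct_change_limit=0.10)
print(f"Sales change {change:.0%}, alert: {alert}")
```

The range conveys how confident the estimate is, which a single number cannot, and the alert rule shows how a portfolio manager could be notified without rerunning the analysis by hand.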

While investment managers are in the early stages of adopting alternative data, the rise of data science platforms is another step toward mainstreaming the technology.

But investment firms integrating alternative data will need a visionary at the top to drive it, said Stephenson.

