May 16, 2020

How to use external data to make better underwriting decisions

Written By CAPE Analytics

New data sources for underwriting property insurance

This is a transcript of a webinar that explores how insurers and reinsurers are using external data to make better decisions, enhance risk mitigation, add real value to customers, and build better in-house models.


The panel:
  • Rob Galbraith, author of “The End of Insurance as We Know It”
  • Matt Junge, SVP property solutions, USA & Canada, Swiss Re
  • Ryan Kottenstette, Chief executive officer, Cape Analytics (moderator)
  • Rick McCathron, Chief insurance officer, Hippo Insurance
  • Barney Schauble, head of Nephila Labs and chairman of Nephila Climate

According to Intelligent Insurer’s survey, held online in the run-up to this event, about 76 percent of underwriters felt external data is every bit as important as internal data. That was the starting point for this discussion which saw five senior executives from the insurance industry debate just how companies can better use data to make better underwriting decisions, add value to their customers and build better in-house models.

The discussion started around the different types of data available, how this has changed, and how the industry is adjusting to how it can better leverage this for its customers. But it quickly moved into the ability of different types of insurers, from start-ups to incumbent players, to leverage this data to add value to their customers and make better decisions generally.

The panel generally agreed that the availability of such data represents a revolution for parts of the industry, but that only companies with the skillsets and the will to leverage this data and combine it with other forms of data will be the eventual winners.

Here, Ryan Kottenstette, Chief executive officer, Cape Analytics, chaired the debate that was also steered by input and questions from a global audience listening live online.


Ryan Kottenstette: What new forms of data have you seen becoming available, and which have had the highest impact so far?


Barney Schauble: I would say three things about data and impact. First, it’s not just about new data, although new data is important. In many cases it’s also about verification of, or completion of, existing data.

If you look at the typical kinds of information you receive when you’re writing property insurance or reinsurance, it’s often incomplete or dated, through no fault of the specific cedants or intermediaries, because that information is hard to update, particularly at scale.

If you visited a house, you could get all the data you want, but it’s very difficult to do that for hundreds of thousands of individual homes. It’s partly about the scale of access to accurate data.

Second, it’s about augmenting that information with new data. You can see some things more efficiently through satellite imagery—for example, what is the condition of the roof, or find information that’s not typically visible to the human eye.

It’s a combination of those two things, the new and the verification of the old. That’s useful from our standpoint as top-down analysis, if you’re looking at a book or a portfolio of many different buildings, and most of our core business is reinsurance.

It’s useful to get an understanding of how a portfolio’s data compares to your expectation, and how those portfolios compare to one another. It’s also useful from a bottom-up standpoint if you’re looking at an individual house, or an individual commercial building—that data can give much more insight into how that property’s going to behave.

But the highest impact is the longitudinal data over time. If you now have a more complete and more accurate dataset with more variables than you traditionally had, you are able to see how that evolves, how individual companies’ portfolios evolve. Most importantly, how those buildings then respond in events that actually occur gives the most powerful conclusion.


I would also note that an insurance company portfolio is not a static thing, it’s evolving every day. Once you have information at a much higher level of resolution, it’s much easier to manipulate it at scale. If you can look at how the portfolio is evolving over time, that gives you much more information about that company and its underwriting habits and practices.

Are they migrating from risk that you find attractive to risk that seems a little bit further out on a limb, or are they migrating the portfolio in ways which make sense as they themselves learn in their primary underwriting practices?


Kottenstette: How might expectations be evolving around the speed of decision-making, and does third-party data have a role to play?


Rick McCathron: We are certainly seeing that. Customer expectations, whether you’re in insurance or anything else, have shifted dramatically, predominantly because of the ease of doing business that companies such as Amazon have created. We don’t live in a siloed world, we live in a dynamic world. And when one industry is shifting customer expectations, the rest of us have to follow suit.

From our perspective, data solves that problem. With data, we can get the answers to 70 traditional underwriting questions in a matter of seconds. That does two things: first, of course, it makes it easier for the user. The user is not asked several dozen questions that they are likely not to know the answer to.

When we’re getting ready to move into a new home, we don’t know the age of the roof. Somebody is going to say, ‘how old is that roof?’, and your answer is going to be something along the lines of, ‘I don’t know, I didn’t live here when the roof was replaced’. Or ‘how far are you from a fire hydrant?’. Or ‘what is the inner wall material of your house?’.

All the questions that traditional insurance companies ask customers are good questions, but the customer doesn’t have the answers, yet they’re required to give an answer. Data solves that problem, because data quickly allows us to get verifiable third-party information, that we know to be accurate, without the need to ask the customer questions they don’t have an answer for.

On the flipside, the same solution for customer expectations of ease and speed helps us with underwriting integrity. Back to that roof question: what if it’s 12 or 15 years, or two years? If we can get that information from verifiable third-party data sources, and we know it to be true, we can price the risk more accurately.

It’s truly a win-win—from an underwriting perspective, and certainly from matching the consumer’s expectations for a quick and seamless transaction.


Kottenstette: Given the rapidly increasing availability of third-party information, how do insurers take advantage of that for internal workflows?


Matt Junge: It’s exciting and challenging at the same time—extremely exciting in the possibilities that we have and challenging in picking what data we use, how [underwriters must] balance the expense of … data with the payoffs we get and the ease of doing business for the customer.

What is the perfect dataset? We can break down the implications into a few different buckets: coverage we can offer; how we do the rating; and how the underwriting process is actually completed.




As to rating, we’ve got dozens of new data sources but maybe it’s an opportunity to simplify our rating process, because now we don’t have to worry about finding data sources that are proxies for the actual risk driver or loss driver.

We have the data available now that shows the loss driver, and we can use that. That makes our rating more transparent, it simplifies it, it allows us to have better pricing and offer maybe more affordable coverage to folks who are lower drivers of risk.

We talk about a lot more data, but it doesn’t mean it has to be more complex. It could mean we’re simplifying things and making the models more transparent, which is going to allow buy-in from all the stakeholders along the way. Not only us as insurers—also agents, the insured policyholders, and then regulators.




As to its impact on the underwriter, and the underwriting workflow, it allows us the opportunity for real-time assessment and remote monitoring that we haven’t had before. It’s working, but there’s so much further we can go.

How do we validate the data we’re getting? Maybe it’s not about new data but validation of the existing data.

I come from a property underwriting background, and inspections are a big expense, and also something that doesn’t usually happen immediately. Now we can do inspections in real time with high quality visual data imagery, and geospatial data.

All this is leading to improved triage, instant binding for a better experience for both the insurer and the insured. It should allow more flexible product offerings as well.

From my experience, regulators understand how the data is evolving, but we might think of regulators as being a roadblock. I’ve had the opposite experience—if we can make things more transparent, and perhaps even simpler, the regulators can be a driver for positive change in this area.


Kottenstette: How are regulators adapting to these changes?


Junge: It is going to depend on the state and on the product, so I would say it’s an evolving view, but encouraging.


McCathron: We had the opportunity to meet with insurance commissioners and deputy commissioners over the last year. Everybody was excited about making sure that the way insurance companies look at things is evolutionary.

They like the fact that things are changing, and they want to make sure it’s compliant, but they can help facilitate change within their departments.


Kottenstette: What is your view of how these new data sources are evolving the industry?


Rob Galbraith: It’s been an explosion, in terms of the number of companies that are offering data to insurance agents, brokers and carriers, as well as better quality, and the sources. There’s a huge variety in the number of sources, particularly with unstructured data. In the past the third-party data was limited to a lot of government-type sources.

You might get building permits, or department of motor vehicle records for accidents and speeding tickets and the like. And there were some large firms selling into the insurance industry for many decades.

But especially in the last five years there’s been an explosion in the number of companies that are selling data, and now the third-party data is better quality; it’s more reliable than just asking the homeowner what the liveable square footage of their house is, or the age of their roof.

Some data sources are new, such as Insight, which is another IoT-enabled sensor, but telematics has been around for almost two decades in some variety; it has changed, of course, and been refined.

We were talking about COVID-19. A lot of processes in the past required what is referred to as ‘windshield time’: somebody has to drive out to a particular home or commercial building location, take some photos, go inside and write a report.

They’re able now just to blink and have somebody on the inside walk around the property and take video footage; the underwriter can often take still images from the video.

A lot of these workflows are being rapidly reworked because of this era of social distancing. We’re seeing a big productivity lift because we’re reducing that windshield time, and I think some of these workflows are here to stay. They’ll be the new normal; we’re not going to go back to the old way of doing things once we are able to come together again in public.

The other thing I’ll mention is that these third-party data vendors in many cases are not just capturing this data and making it available, they’re curating it, they’re using artificial intelligence (AI) to put it in context, eg, geospatial location, so I can see that the roof is in a good or poor condition. The AI and machine learning and other techniques can already give me an indication of that.

In the future we’re going to see a lot more of this information fed into a single cloud-based platform for underwriters to be able to leverage, whereas today they’re having to do a lot of supplemental data gathering on their own, doing internet and social media searches, etc.

Longer-term there will be blockchain solutions where a lot of this information will be available on a blockchain in a secure way, because data privacy and security are a concern.


Kottenstette: How might a carrier’s approach to this depend on whether they are a new carrier versus incumbent?


McCathron: There is one massive advantage for new insurers into the market, and that’s the availability of the third-party data. If you go back historically, the only data available was carrier legacy data from decades of policies, claims experience and underwriting results, and they could use their own data as an advantage that new entrants to the industry didn’t have.

With the availability of so many different sources of third-party data, often that data overlaps so you may get coverage of only 80 percent for one particular source for one particular data element. But sure enough, a week or two goes by and another data source has presented itself to cover the gaps from the first.
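The gap-filling idea described here can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the vendor dictionaries, the property IDs and the roof-age values are hypothetical, and trying sources in a fixed priority order is only one of several plausible ways to combine them.

```python
from typing import Optional

# Hypothetical roof-age data (in years) from two vendors, keyed by property ID.
# None means the vendor has no value for that property.
SOURCE_A: dict[str, Optional[int]] = {
    "P1": 12, "P2": 3, "P3": None, "P4": 20, "P5": 7,
}
SOURCE_B: dict[str, Optional[int]] = {
    "P1": None, "P2": 4, "P3": 15, "P4": None, "P5": None,
}

def coalesce(property_id: str, sources: list[dict[str, Optional[int]]]) -> Optional[int]:
    """Return the first non-missing value, trying sources in priority order."""
    for source in sources:
        value = source.get(property_id)
        if value is not None:
            return value
    return None

def coverage(property_ids: list[str], sources: list[dict[str, Optional[int]]]) -> float:
    """Fraction of properties for which at least one source has a value."""
    filled = sum(1 for pid in property_ids if coalesce(pid, sources) is not None)
    return filled / len(property_ids)

PROPERTY_IDS = ["P1", "P2", "P3", "P4", "P5"]
coverage_one_source = coverage(PROPERTY_IDS, [SOURCE_A])
coverage_two_sources = coverage(PROPERTY_IDS, [SOURCE_A, SOURCE_B])
```

In this toy portfolio the first source alone covers four of the five properties, and adding the second source fills the remaining gap. Where two sources disagree (as for P2 here), the priority order decides which value wins, so that order is itself an underwriting choice.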

It has created a level playing field where one could argue that the newer players in the space now have an advantage because they are not saddled with legacy data that you hope is correct.

I can’t tell you how many times, back in my auto insurance days, insurance companies were trying to have a de facto rate change, and they would tell their agents to just put everything down as pleasure use: 5,000 miles. That skews the data over time, and it impacts your ability to price risk correctly.

If you look at a history of 40 or 50 years, you can imagine all the different data elements that had been corrupted within your legacy book of business. However, the new data sources are relatively pure, and as long as you can get enough of them to provide ample coverage that you have a high degree of trust in, that’s an advantage for the newer players like Hippo.


Galbraith: I agree in terms of the disadvantage now being an advantage, but in the past there was a lot less external data available, so most of the data that was available was through those direct questions asked by agents of insurers and captured by the carriers. The larger the carrier, the bigger the book of business and the more internal data it had, and that was a competitive advantage.

Now, with the availability of external data, it certainly helps the startups, and I would also say that because of those legacy systems, it is very challenging for more traditional carriers to find ways to integrate their external and internal data together to get a complete picture.

It could be a powerful advantage, wrangling that data, managing it, bringing it together, and to provide the context and the 360-degree view of the individual risk. Being able to do that at scale causes a lot of complicated challenges, if your tech stack isn’t up to modern standards or if you’re stuck back in a 1980s mainframe world.


Junge: From what I see with the legacy insurers we work with, people may have a bias towards their own legacy data, and a view that the data from all these new startup data providers is either unproven or wrong.

That’s what a lot of legacy insurers have to struggle with—whose data do we believe? It’s easy to rely on your own data, but are we managing it the right way, are we analysing it the right way? That’s just one more thing for legacy insurers to grapple with.


Schauble: I’m not sure that having a new software system versus an old software system means that you’re going to be a better user of data to make decisions.

Reinsurance and insurance are in the business of dealing with uncertainty. When you’re trying to make a decision around ‘is this a good risk?’, and ‘how should you price a specific risk?’, the more you can whittle down sources of uncertainty, the better decisions you can make, and also the more efficient pricing signal you’re going to get.

For example, legacy carriers have to make assumptions such as ‘if I don’t have specific information about this building, I’ll just assume that it is by default the average of other nearby buildings’.

The sort of noise that gets introduced into the process isn’t helpful to anybody, it’s not helpful to buyers, because there’s no accurate information about their risk, and it’s not helpful to sellers whether they’re insurance or reinsurance companies.

It’s less about how shiny your software system is, and more about if you can put the entire industry on a slightly firmer footing, you have more of a foundation around accuracy, that should lead to better risk selection and more effective pricing, and more effective portfolio construction.

A lot of it is about removing uncertainty, and different people will use that in different ways.


McCathron: I agree that the system doesn’t necessarily make for good handling of data, but the system allows you to handle the data in a more timely fashion. So if you have a nimble system that can plug in multiple data sources within a week, and you’re starting to integrate those data sources with your book of business in a matter of a few weeks, as opposed to six or 18 months that some of the legacy carriers are hamstrung with because their systems are not as nimble, ultimately you’d probably get the same information, but that timeliness and speed to market with the ever-evolving data sources would create an advantage.


Kottenstette: It’s a potential advantage, but you have to use it properly. From our vantage point, we sell alternative information, risk signal to reinsurers, primary carriers, new carriers as well. I would echo a lot of what I’ve heard here in terms of the advantages of being a new entrant.

But the other side of the coin is that some of the largest legacy carriers have very deep proprietary loss history information.


Galbraith: I completely agree with that. One of the hardest things for startups is pricing the risk, that needs to be adjusted later on as people are identifying markets they find attractive. More often than not, I see the rates tend to go up after they’ve misjudged and mispriced the risk.

A claim is actually a rare occurrence. On any given day in your home or in your business, do you claim or not? The answer is almost always no—in fact you could build a pretty good model from a statistical standpoint if you predicted that people were never ever going to have a claim.

If you’ve got 5 percent claim frequency, you can be 95 percent accurate in saying that you’re not going to have a claim. But clearly the signal is there.
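The arithmetic behind this point can be sketched in a few lines. The portfolio below is hypothetical (100 policies, 5 of them with a claim), and the `accuracy` and `recall` helpers are written out by hand for illustration rather than taken from any particular library.

```python
def accuracy(actual: list[int], predicted: list[int]) -> float:
    """Share of policies where the prediction matches what happened."""
    matches = sum(1 for a, p in zip(actual, predicted) if a == p)
    return matches / len(actual)

def recall(actual: list[int], predicted: list[int]) -> float:
    """Share of actual claims that the model predicted as claims."""
    claims = sum(actual)
    caught = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 1)
    return caught / claims if claims else 0.0

# Hypothetical portfolio: 1 = policy had a claim, 0 = no claim (5 percent frequency).
actual_claims = [1] * 5 + [0] * 95

# A "model" that predicts nobody will ever have a claim.
never_claim = [0] * 100

baseline_accuracy = accuracy(actual_claims, never_claim)
baseline_recall = recall(actual_claims, never_claim)
```

The never-claim prediction is right for 95 of the 100 policies, yet it identifies none of the five claims, which is why accuracy on its own says little about whether a model has found the signal.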

A claim doesn’t always correspond to a specific thing. There’s definitely a loose correlation between certain events and claims, but it’s not always as straightforward as we would like to think. Having said that, the increase in the use of IoT-based sensors, whether that be in a car, in a home, in a business, or in wearable devices, can provide what we call near-misses. You had a plumbing leak, but you grabbed a bunch of towels and you mopped it up before it damaged the floor, so you didn’t file a claim.

From an insurance standpoint that’s still an interesting thing. You had a leak, you reduced the damage this time, but did you actually address the underlying problem, did you make any adjustment in your plumbing?

There are ways to overcome that, but you’re right, there’s no substitute for having actual claims data, that’s very valuable.


Kottenstette: How do we see advances in personal lines compared to other areas such as professional liability or employee benefits?


Schauble: The approach makes sense if you can get some of that information from a source where it’s validated and clear, rather than from you as the person filling out the form to apply for directors and officers or workers’ comp insurance.

If you can, for example, and some companies are working on this, link your information around the number of employees and revenue directly into an insurance company, or into some third party that’s validating that, and has that information and can provide it on a secure basis, then you can get a more accurate flow of information.

I do think that the theme applies to other lines of insurance, but we’re not specialists in those other lines.


Junge: Homeowners or personal lines do get more attention. Maybe it’s about size; there are many more data sources available for personal lines. But if you look at commercial lines, one of the challenges for the industry is how we determine our contingent business interruption and our supply chain coverage.

It’s still a challenge, but given the way data sources are evolving and how the AI is working, that’s another area where we are beginning to see big improvements from where we were, driven by new data.


Kottenstette: How do you manage ethical considerations from using third-party data? If you depend on a black box for these deliverables, the regulators could have concerns.


McCathron: We try not to have a black box as it relates to the regulators, in fact we probably err on being overly transparent with the regulators. We explain exactly what we’re trying to do, how we’re trying to do it in a collaborative way.

Most departments of insurance now have a regulatory sandbox that was designed for newer companies to work collaboratively with them. I don’t know that there’s a tremendous amount of proprietary information available in personal lines homeowners.

The trick is making sure you get all the information that’s available, and then making sure you take that data, as previously stated, and process it in an intelligent way—and not just to improve the user experience, because that’s the easy part.

Make sure you are pricing risks correctly, and have quality data science that shows how each individual data source interacts with your projected claims frequency, and claims severity.


Junge: Transparency is also important in terms of explainability to a customer, and to executives. If you’ve used one that has a black box approach versus one that has a more open approach, the lift you would get is not substantially different, but all things being equal, you’d lean towards the more transparent approach.

And I know some regulators, such as the New York Department of Financial Services, have said they want to see the components that the algorithm uses. I know there is some concern around sharing that, as some carriers are now competing on data algorithms.

It’s going to be an ongoing discussion, but I do think it’s better to engage the regulators in these discussions, rather than to end up playing defence and having terms dictated, so as an industry it’s a broad conversation we need to have.

There’s a whole lot of new questions and new ethics we’re grappling with as a society and those will continue over the next several years. It’s important to have those productive conversations.


Kottenstette: If you think about third-party data, what is the most valuable and has the highest impact?


Junge: It probably depends on the capabilities of your organisation, and how much time you can invest internally. What’s important, as we’ve talked about, is transparency, going back to the black box comments, whether it’s the raw data or the risk for a full-blown model. Do you know how it’s working, and then do you believe it? Have you validated it? Is there a way that you can validate it?

Within Swiss Re, the way we approach data sources is that the raw data is interesting and often good, but it’s nice to have the next step completed, and I think a lot of insurers are looking at that too.

They say ‘I have this raw data but what do I do with it, I don’t have the data science staff’. So it’s getting to the next level where there’s something impactful being provided that you can relatively quickly incorporate into some sort of rating or underwriting process that doesn’t require a lot of expertise on the data science side.


McCathron: When we started Hippo, we used a tremendous amount of outside actuaries. The data we needed had to be that ‘second stage’ data, because we didn’t have the internal capabilities to analyse that raw data.

As we have evolved over time, we have a full actuarial team looking at that and creating a risk score. The risk score we use is not necessarily whether we write or don’t write a piece of business, but it’s how we write or don’t write a piece of business.

We call it the happy path, or the rocky path, within the Hippo culture. As we’ve evolved, we’ve figured out that there are some risks we’re willing to write through a more rocky path, meaning asking more questions, getting more data, verifying information—maybe via a physical inspection.

The flipside is we have policies going to the happy path: as we get our initial set of data, we run it through our risk score, and we think there is little unique exposure here, and customers answer very few questions through that process.
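The routing described here can be sketched as a simple triage function. This is an illustration only, not Hippo’s actual method: the 0-to-1 score scale, the `HAPPY_PATH_MAX_SCORE` cutoff, and the `Routing` and `route_application` names are all hypothetical.

```python
from dataclasses import dataclass, field

# Hypothetical cutoff on a 0-to-1 risk score; a real carrier would calibrate this.
HAPPY_PATH_MAX_SCORE = 0.3

@dataclass
class Routing:
    path: str                                  # "happy" or "rocky"
    extra_steps: list[str] = field(default_factory=list)

def route_application(risk_score: float) -> Routing:
    """Decide how an application is written, not whether it is written."""
    if not 0.0 <= risk_score <= 1.0:
        raise ValueError("risk_score must be between 0 and 1")
    if risk_score <= HAPPY_PATH_MAX_SCORE:
        # Little unique exposure: the customer answers very few questions.
        return Routing(path="happy")
    # Higher exposure: still written, but with more checks along the way.
    return Routing(
        path="rocky",
        extra_steps=[
            "ask more questions",
            "get more data",
            "verify information",
            "consider a physical inspection",
        ],
    )

low_risk = route_application(0.1)
higher_risk = route_application(0.7)
```

A single cutoff keeps the sketch readable; in practice the extra steps would likely depend on which data elements drove the score rather than on the score alone.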

As new insurtechs start to evolve, they also evolve in their use of raw data and how to prioritise that.


Schauble: Also, what can you use the data for? You can use it for selection, for pricing, it depends on where you are in the risk chain. Looking at an individual building is very different from looking at many portfolios of many buildings and trying to determine the relative truthfulness of the data you have across those portfolios.

Part of what we haven’t explicitly discussed here is what can you do then. From an insurance market pricing standpoint, there are certain places where you can then finely tune your pricing to reflect that data, and how it evolves over time.

There are other places where you aren’t going to have the freedom, such as the primary homeowners market, to adjust pricing in the way that you might like, so then it becomes much more of a filter through which risk has to pass, in order for it to be accepted to your portfolio.

The form of the data is informed not only by the capabilities of the company, and where you are in the risk chain, but also by what options you have—can you really fine-tune prices as people are trying to do with telematics?


Kottenstette: How do you see carriers’ spend budgets shifting based on where they acquire data and start to filter it?


McCathron: It’s difficult to answer because everybody is trying to compete for the customer that’s less likely to have a claim. It gets more expensive trying to acquire those customers because everybody is trying to compete for them.

In theory, you want as many of those people as you can get, and to create a simplistic approach for them. The flipside of that is there are good policies that are higher risk, but you can get an appropriate premium. There is an argument that if you can find these and get a higher premium, you’re creating loyalty with that customer.

You retain that customer, so your lifetime value of that customer becomes more valuable to you as an organisation than might the happy path customer, who is constantly shopping because everybody is competing for that.

I don’t think there’s a right or wrong answer here. It’s the way you use that data and continue to find a way to differentiate the two, and to tell a desirable rocky path person from one you simply don’t want to write.


Schauble: That’s part of the beauty of the competitive component of the business: there are going to be very different business models in terms of how to answer that question. You could arguably say, ‘I don’t care at all about the input data, I’ll trust the overall market pricing and not spend any money’.

I’ll have the best claims adjustment end of my business and I’ll deal with all of it at that point in time. Or you could say ‘I’m going to outsource claims adjustment entirely’ because it’s obviously popular in the insurtech world, but I’ll select risk much more appropriately.

You could certainly say ‘I’m going to target the highest price highest risk and count the margins’ or ‘I’m going to target the lowest risk and live with the fact that even though the margin may be lower, it’s going to be much more predictable’.

There’s an opportunity for all the different entities in the market to come up with their own interpretation. It can all rest on the foundation of better, more complete, and more continuous data.

How you then interpret data and how you do business—there’s a whole array of different approaches you can take, depending on what you think your relative strength is and what your goal is. If your goal as a public company is to sell more policies this year than last year, that leads you to a very different set of behaviours than if your role as a private profit-maximising enterprise is to say ‘what I really want to do is pick the individual best risk’.

That’s where the fun exists in the business—the ability to make that interpretation and go forth and find the best way to execute that business plan.