Big Data

Summary: AI, Machine Learning, Big Data, Math Optimization

This is a summary with links to my posts about Big Data and the fancy algorithms and math that are variously called Artificial Intelligence, Cognitive Computing, Machine Learning, Big Data Analytics, math optimization, etc. I have worked as an insider in these fields for over fifty years.

Huge amounts of time and money are spent on these things, but getting practical, real-world results is not so easy. You can make things worse.

How do you make things better? It starts with data.

Everyone knows that data is important. There’s even a field called Big Data.

What people call Big Data often isn’t nearly as big as people think.

Do the arithmetic!
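As a rough illustration of the kind of arithmetic meant here, the sketch below multiplies out a data set's size and compares it with the memory of a single server. Every number in it (customer count, transactions per customer, bytes per transaction, server RAM) is a made-up assumption, not a figure from any real system.

```python
# All figures are hypothetical, chosen only to illustrate the arithmetic.

def dataset_size_gb(num_records: int, bytes_per_record: int) -> float:
    """Raw size of a data set in gigabytes (1 GB = 10**9 bytes)."""
    return num_records * bytes_per_record / 1e9


def fits_in_memory(size_gb: float, server_ram_gb: float) -> bool:
    """True if the raw data would fit in the RAM of one server."""
    return size_gb <= server_ram_gb


if __name__ == "__main__":
    customers = 10_000_000            # assumed
    transactions_per_customer = 100   # assumed
    bytes_per_transaction = 200       # assumed

    records = customers * transactions_per_customer
    size_gb = dataset_size_gb(records, bytes_per_transaction)

    print(f"{records:,} records, about {size_gb:.0f} GB")
    print("fits on one 512 GB server:", fits_in_memory(size_gb, 512))
```

Under these assumptions, a billion transactions come to about 200 GB, which is less than the memory of one well-equipped server.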

For all the talk about Big Data, it’s clear that making sure that the data is accurate and complete is too much to ask.

Among other things, the status hierarchy in Data Science makes it clear that anyone with ambition needs to get as far away from the actual data as possible.

Without data integrity, your analytics is screwed. It’s worse when the data is about your health. The integrity of data in healthcare electronic medical records is a major issue.
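To make "accurate and complete" concrete, here is a minimal sketch of an integrity check that could run before any analytics. The field names and the single consistency rule are hypothetical; a real medical record would need many more checks.

```python
from datetime import date

# Hypothetical schema: these field names are assumptions for illustration.
REQUIRED_FIELDS = ("patient_id", "birth_date", "visit_date")


def integrity_problems(record: dict) -> list[str]:
    """Return a list of human-readable problems found in one record."""
    problems = []
    # Completeness: every required field must be present and non-empty.
    for field in REQUIRED_FIELDS:
        if record.get(field) in (None, ""):
            problems.append(f"missing {field}")
    # Consistency: a visit cannot happen before the patient was born.
    birth, visit = record.get("birth_date"), record.get("visit_date")
    if isinstance(birth, date) and isinstance(visit, date) and visit < birth:
        problems.append("visit_date before birth_date")
    return problems


def error_rate(records: list[dict]) -> float:
    """Fraction of records with at least one problem."""
    if not records:
        return 0.0
    bad = sum(1 for r in records if integrity_problems(r))
    return bad / len(records)
```

Even a crude error rate like this says how much to trust whatever gets computed downstream.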

Even simple insurance provider network data is too often wrong, putting yet more obstacles between patients and the services they need.

Even if all the data were perfect, no one seems to ask why, exactly, Big Data is better.

Is Big Data delivering results? Here’s what happened with Big Data and Hadoop at Yahoo.

With all the money going to Big Data, people are ignoring the massive benefits of leveraging little data.

The few people who study computer history have noticed that Big Data is remarkably similar to what used to be called EDW, Enterprise Data Warehouse.

Same thing with a new name. Hmm. Could it possibly be that Big Data is just a fashion trend?

At least for insiders, there's lots of humor to be found in data. There's even a book of the fun highlights.

After data, you move to the amazing variety of techniques available, each of which has specific areas of applicability.

Even a narrow-sounding field like Machine Learning has incredible variety, and there are lots of things you need to do before the algorithm can be effective.

The simple concept of closed loop is essential to getting and keeping good results, but isn’t applied as often as it should be.
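One way to picture closed loop: act on a prediction, observe what actually happened, and feed the difference back into the next decision. The sketch below does this for a score threshold. The function name, the step size and the adjustment rule are all assumptions made for illustration, not a description of any particular product.

```python
def closed_loop_update(
    threshold: float,
    scores: list[float],
    outcomes: list[bool],
    max_false_alarm_share: float,
    step: float = 0.05,
) -> float:
    """One pass of the loop: compare flagged cases with real outcomes, then adjust.

    scores   -- model scores for past cases (higher means more suspicious)
    outcomes -- what really happened: True if the case turned out to be bad
    """
    flagged = [bad for score, bad in zip(scores, outcomes) if score >= threshold]
    if not flagged:
        # Nothing was flagged, so loosen the threshold to start catching cases.
        return max(0.0, threshold - step)

    false_alarm_share = sum(1 for bad in flagged if not bad) / len(flagged)
    if false_alarm_share > max_false_alarm_share:
        # Too many false alarms: be stricter next time.
        return min(1.0, threshold + step)
    # False alarms are acceptable: try to catch more next time.
    return max(0.0, threshold - step)
```

Without the feedback step, the threshold stays wherever it was first set, no matter how the results turn out.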

The application of the principles to healthcare has proven to be challenging, with high-profile failures. But there are successes.

I've discussed the application of the principles to fintech, with a focus on anti-fraud. The current leader in the field displaced an incumbent that was using neural network technology, showing that the algorithm chosen and how it’s applied make all the difference.

Natural language AI has proved notoriously hard to make practical. Here’s an example of how it’s been done in fintech customer support.

All too often, AI research is conducted in isolation. There is a great deal to do to ensure that any results achieved can be integrated into production systems.

As is true in most fields, in healthcare the money and attention tend to go to expensive, fancy methods requiring PhDs instead of simple things that actually work.

You don’t need AI or cognitive computing to find or spread the new discoveries that humans make.

There are many ways to spend lots of time and not get practical results.

Even when an optimization technique is perfected and proven in practice at scale, it can take decades for it to be used in other relevant fields.

Impressive analytics performed by recognized experts are prone to producing the results the expert wants.

Corruption of data or process can lead to bad results anywhere.

For cynical definitions of Big Data, Machine Learning, Deep Learning, Cognitive Computing and AI, in the tradition of Ambrose Bierce, see these.