Medical studies are essential to knowing what works and what doesn't work in medicine. There are a few problems, though. There aren't nearly enough studies, they are expensive and cumbersome, the funding often comes from groups seeking a particular outcome, there isn't enough follow-up, most of the data is secret, and the studies are rarely crafted for personalization. Among other things. What can we do?
Often the cure for a problem isn't isolated genius, but finding a field that had a similar problem that got solved and adapting the solution. I propose that the problem of building software (expensive, cumbersome, takes too long, etc.) is similar to that of medical studies, and the solution of making software open source can be adapted to the problem of medical studies. If medical research and data were open source, most of the problems I listed could be solved.
Open Source Software
The open source software movement has revolutionized the industry. Operating system software was once the proprietary crown jewel of computer manufacturers. IBM's 360 mainframe operating system software, for example, took over 1,000 person-years to build. A well-known book by one of its leaders, Fred Brooks' "The Mythical Man-Month," went into detail explaining the nightmare.
There's been a revolution since then. The Linux operating system completely dominates the operating system market; for example, it runs on over 95% of the top million web servers. This isn't news -- Linux was started over 30 years ago! Since then, even major profit-making software companies such as Google (Android, Chromium, Kubernetes) and Facebook (React) sometimes open-source valuable software they've built internally.
Much (not all) open source software is built by volunteers, and the resulting software is freely available. Sometimes company employees work on open source that is valuable to their employers. There are hybrid models such as Red Hat, which charges for services it offers to companies that want to use the open source software. After the early years of resistance and skepticism by traditional programmers and managers, open source software is broadly accepted as a fact of life -- and a good fact! -- in the software world.
Open Source Medical Data
The data from a research study is incredibly important to the people whose disease or condition is studied, to the medical professionals who treat it and to the device or pharma company that creates the new device or drug. The results of the study cause the patients (guided by medical providers) to take drugs, change their behavior or undergo procedures that can have a major impact on their lives. Shouldn't that data be freely available to anyone who cares to study it? Just as open source software hugely benefits by having large numbers of volunteers pore over the code looking for errors, limitations and omissions, so would open source medical test data benefit by having large numbers of people who are even more motivated than software contributors comb through the data -- in software, we're talking about annoying bugs, while in medical data we're talking about life and death.
Anyone with software experience knows that no amount of software testing in a lab environment can match what happens to the software when it's widely distributed. When things go wrong with open source software in the field, open source contributors have a real-life test case of the error and a reasonable shot at finding and fixing the problem, contributing their fix to the central source code. With thousands upon thousands of copies of the software working all over the world and motivated engineers responding to issues and pooling their solutions, open source software achieves a quality that can't be matched by dedicated groups of employees working for a company. Much less a government agency.
The equivalent of this for medical testing is to start with opening all the test data to volunteer analyzers, withholding nothing. Releasing all the data that is now kept secret would be a big step forward.
But that's the equivalent of lab testing. The huge value in open source data will come from extending it to more people than were included in the study, and from including much more data about them, ranging from before the formal start of the study to continuous aggregation of data over time. Among other things, this will enable surfacing factors that weren't considered by the original study designers, both from patient history and from medical events that take place after the formal end of the study. For example, this kind of extended data could surface the facts about the relationship between blood pressure pills and going blind, as I describe here and here.
Open Source Medical Studies
There is no reason why paid medical researchers couldn't continue to define and run medical studies in much the same way as they do today, just as for-profit tech companies create software that they then open source. However, they would have to make 100% of their data open source and fully available for anyone to investigate.
The "open source" version would be first to expand the selected participants in the study far beyond what would normally be done with volunteers, and second to extend the data collected to everything that is knowable about the participants, both before the start of the study and continuing long after what would normally be its conclusion. I don't claim to know how best to accomplish this, but I know that today the cost of running study sites, qualifying participants and so on is high. A way would have to be found to enable participants to volunteer remotely, and to enable local volunteers to perform whatever actions, such as drug injections, have to be performed locally and physically.
This process really kicks in when the new drug or procedure gets past the test environment and becomes more widely deployed. It would be good to emulate the open source software practice of having a careful staged roll-out of a new release instead of the current medical practice of unlimited distribution after approval. This would enable reports from the field, enhancing the open source data, to surface problems that weren't clear in the earlier, more limited testing of the new drug or procedure.
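The staged roll-out idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the stage names, the `run_rollout` and `next_stage_allowed` functions, and the 1% tolerated adverse-event rate are all hypothetical, not taken from any real regulatory process.

```python
def next_stage_allowed(adverse_events, patients_treated, max_event_rate):
    """Return True if field reports so far justify widening distribution."""
    if patients_treated == 0:
        return False  # no field data yet, so do not expand
    return adverse_events / patients_treated <= max_event_rate


def run_rollout(stages, reports, max_event_rate=0.01):
    """Walk through the stages in order and stop at the first stage whose
    field reports exceed the tolerated adverse-event rate.

    `stages` is an ordered list of stage names (hypothetical).
    `reports` maps stage name -> (adverse_events, patients_treated).
    Returns the names of the stages that were cleared.
    """
    cleared = []
    for stage in stages:
        adverse, treated = reports.get(stage, (0, 0))
        if not next_stage_allowed(adverse, treated, max_event_rate):
            break  # halt the roll-out; later stages are never reached
        cleared.append(stage)
    return cleared


# Hypothetical field reports: the pilot looks fine, the regional stage does not.
stages = ["pilot", "regional", "national"]
reports = {"pilot": (3, 1000), "regional": (900, 50000)}
print(run_rollout(stages, reports))
```

The point of the sketch is the gate between stages: wider distribution happens only after reports from the field have been collected and checked, which is the software practice the paragraph above proposes borrowing.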
Once the distribution gets very broad, there still needs to be a way to surface and report issues. For example, Google shows its users a message asking permission to enable broad data reporting about one of its products.
Why shouldn't such permission be added to patient medical records, so that as those records are updated for any reason, the updates are added to any relevant open source data collections? This would make longitudinal tracking automatic and painless to everyone involved.
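As a rough sketch of how such a permission could work, the Python below forwards each record update to the open data collections a patient has joined, but only when the patient has opted in. Everything here is an assumption made for illustration: the `PatientRecord` and `OpenCollection` classes, the `shares_open_data` flag, and the `apply_update` function are hypothetical names, and a real system would also need de-identification, which this sketch leaves out.

```python
from dataclasses import dataclass, field


@dataclass
class PatientRecord:
    pseudonym: str                       # hypothetical stand-in for a de-identified ID
    shares_open_data: bool = False       # hypothetical consent flag, off by default
    enrolled_collections: set = field(default_factory=set)


@dataclass
class OpenCollection:
    name: str
    entries: list = field(default_factory=list)


def apply_update(record, update, collections):
    """Forward `update` to every open collection the patient is enrolled in.

    Nothing is forwarded unless the consent flag is set. Returns the names
    of the collections that received the update.
    """
    if not record.shares_open_data:
        return []
    delivered = []
    for name in sorted(record.enrolled_collections):
        collection = collections.get(name)
        if collection is None:
            continue  # enrolled in a collection this system does not know about
        collection.entries.append({"patient": record.pseudonym, **update})
        delivered.append(name)
    return delivered


collections = {"bp-study": OpenCollection("bp-study")}
record = PatientRecord("p-001", shares_open_data=True,
                       enrolled_collections={"bp-study"})
print(apply_update(record, {"event": "routine eye exam"}, collections))
```

Because the forwarding happens inside the same step that updates the record, the patient and the provider do nothing extra, which is what would make the longitudinal tracking automatic.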
Conclusion
Medical studies and associated data strongly resemble the proprietary operating systems of computer vendors in the 1960s and 70s. Each body of code was created at great expense by employees of the companies. The code (like medical data) was considered a trade secret, never to be revealed to an outsider. Problems usually surfaced after the code was shipped, just as many problems with approved drugs only surface after they are distributed. Manufacturers kept spending more time and money to make their software bug-free in the lab before shipment, but never got it right -- just as drug makers jump through endless FDA hoops prior to approval, and there are still problems. Makers of proprietary software have huge quality problems to this day, as I have documented, which the "free" open source software largely avoids.
Applying open source software concepts to medical drug and procedure testing and tracking could greatly enhance the safety and effectiveness of what gets added to the toolkit available to patients who have medical issues. As the approach became understood and widely used, patients would have reason to have confidence and trust in the medical profession far beyond what many of them have today. Instead of being constantly hammered about how some drug is "safe and effective," which kinda tells many patients that it probably isn't, the open source method would create a level of transparency and openness that would let people draw their own conclusions.
I have been thinking of this issue for a long time; a discussion with Jonathan Bush at the recent HLTH conference inspired me to write it up.