Talk by Mark Little, Red Hat Inc on Open source challenges in the enterprise


Another key thing that happened in the early 90s, which really facilitated the evolution of open source, was the World Wide Web. So Tim Berners-Lee wrote the first HTTP daemon while he was at CERN, and he made it freely available for people to download. And again, this whole free aspect, being able to build it for your environment, really kick-started the web. And if you think about where we are today versus where we were then: I don’t know if anybody was around using the web in ’92. Well, no, some people were around on the web in ’93.

You could literally capture all the websites in the world on a single HTML web page, and CERN was maintaining that web page. These days we would Google to find out what’s out there; we don’t even go to CERN anymore. But you would always go to CERN, and you would see what new websites had come online in the world. And it was your duty, if you created a new website, to try and get in touch with them and have them add you to, effectively, the Yellow Pages, I suppose. But this kicked off a huge adoption of the web and, in many ways, e-commerce globally. You can trace the success of the likes of Amazon, Google, Facebook, Twitter, Netflix and lots of other companies to the web, and to the release and evolution of Linux, and there are many other benefits that came after that.

Just to give you an idea, I was talking a minute ago about how the map of the web could be maintained on a single web page. This is a 3D map of the web from 2009. I couldn’t find a more recent one. The color coding is important; if you go to that website you can actually look at the color coding. But, if memory serves, I believe blue is academic sites, green is sites based in the US, and turquoise is UK or Europe, and you can kind of dive into that in a bit more detail.

Okay, so we’ve had a quick history of what we were doing before open source really happened. And to understand some of the challenges that open source has faced, and still faces to a degree, in the enterprise, I want to give you some examples of where it’s used in the enterprise today. You may know all of these, you may know some of them, you may not know any of them, but I think it’s important to know where it is being used today.

So over the last few years there has been this huge expansion in the use of mobile phones, mobile data, cloud, a programming language explosion, lots of things going on. Android, as I’ve already mentioned, uses Linux on the phone, and originally was based on a variant of the Java programming language for developing applications that would actually run on Android. Amazon and others really use Linux on the back end. From very early days Facebook and Netflix have all been based, to one degree or another, on open source technology. And like I said, new programming languages: in the last 10 years we’ve probably seen more programming languages come online and be used properly than in the previous 30 years together, and, you know, these are just a few examples. But all of these have been driven by open source.

Then, with the explosion of the Internet of Things, related to big data, and, you know, its sensors, and sometimes sensors that are extremely powerful, in many ways more powerful than the laptops you might have been using 10 years ago, there was also a rise in the amount of data that they were generating. A lot of the data that comes from sensors and phones is geolocated, so it has information about the time, the location, etc. And then people started saying, well, you know, we need that data, we want to look at that data, we want to analyze that data, but how do we store that data? And what are known as relational databases, like Oracle, DB2 from IBM, and MySQL, which is an open source database, started to creak a little bit. They weren’t designed for the sheer amount of data that was being generated. So there was an explosion in what were known as NoSQL databases, which originally meant “no SQL” and now means “not only SQL”.

But every single one of the logos that you see up there either represents an open source effort, or a company that uses open source in one way or another. So even Oracle: they acquired Sun Microsystems, and from the acquisition they got MySQL, which is up there. And they also have other activities, you know, but everything here is, in one way or another, open source.

And in fact, I’d say that most of the work if not all the work in, in the area of big data and NoSQL has been done in open source first. Possibly.

So, something about artificial intelligence and machine learning. I’m sure I could put more logos up there. But Apache Spark, very popular in the analytics and machine learning space; TensorFlow; Kubeflow, which is a relatively new kid on the block. Again, all open source, being driven by a number of different companies and a lot of different contributors, but all open source.

So, from the previous slides, looking at what enterprises are using, and these combinations of things from open source, you might think that the benefits of open source are pretty obvious, and that open source is extremely competitive with closed source solutions: competitive in terms of reliability, dependability, performance and cost. Otherwise enterprises wouldn’t use it. Many of the areas where we see things like NoSQL being used, or some of the examples on the previous slide with, you know, Android, etc., are all fairly mission critical, so you don’t want to mess around with these things. And you might think, well, you know, closed source is pretty much consigned to the history books and open source is going to be the default.

Unfortunately, it’s a little bit more complicated.

And we’re going to see why it’s been a little bit more complicated.

This is a picture of a gentleman called Steve Ballmer who used to run Microsoft.

And back in 2001 he made this statement, that Linux is a cancer that attaches itself, in an intellectual property sense, to everything it touches. Microsoft were going all out to, basically, kill Linux, kill open source, at this point. And this is probably the most famous quote of anybody’s. There were a few other companies that would do similar things, but I think Steve Ballmer, being the character that he was at the time, kind of captured the media. And this is the thing that ran around the world.

Now, he was saying this not just because open source was seeing this huge interest from developers and also from industry, but also because of, as he says in the quote, or hints at in this quote, the licensing that gets attributed to open source. And this is one of the hurdles that open source in general, and certainly in the enterprise, has faced and continues to face. There are a plethora of licenses that you can choose from when creating an open source project. And some of those licenses can restrict how you can actually use those projects in production. In fact, they have knock-on effects on what you have to do once you start to embed them. Linux uses the GPL license.

And this is still the case today, but it’s not as bad as Steve was trying to make it.

Usually the kind of concerns that we hear about open source, from companies that would like to talk to us about getting involved in it, and from vendors who would like to try and push back on that, are things like: well, there’s not enough quality open source code out there, it’s all people playing around in the back bedroom in their spare time; the use cases aren’t broad enough for enterprise adoption; closed source has been around for so long, we know what you need, you trust us, we’re in production, why change it if it’s not broke?

And then some quotes. Well, at least one quote that I got directly from one of our customers a while ago was, you know: we’ve been told open source isn’t good enough for the enterprise, it’s only good enough for playing around. I would, you know, find it, download it and kind of kick the tires on something and figure out what I might want to do, and then decide to be an adult and go and buy something from a closed source company. And related to that is the idea that open source developers are not as good as the closed source developers that you might get from a closed source company.

And then there’s another point of where do I start if I’m interested in even evaluating what’s out there in open source stuff.

And let’s look at this last bullet, just for a second, about where to begin. Because you might think, why is that a problem for open source?

Well, you know as Douglas Adams said space is big and open source is kind of big.

Here’s some examples.

GitHub is a very popular site for people to use to host open source projects. And when I took this screenshot, there were 11 million people using over 27 million projects on GitHub, in lots of different languages.

There’s just another quick screenshot of part of the Apache Software Foundation, and the Eclipse Foundation on the far left, and then down here we have the JBoss developer site, which is where Red Hat hosts many of its popular upstream projects. And as you can see, there’s a lot going on here.

I should say first of all, by the way, that Apache and Eclipse are foundations that have sprung up in the last 15 years or so as places where people can host their projects. They kind of predate GitHub, more or less. These days they’re really good places to go if you want to have a vendor-agnostic experience of various projects. There are other foundations, but both Apache and Eclipse strive very hard to make sure that projects are not dominated by one vendor, which is a good thing. If you have a project that’s dominated by a single vendor, and that vendor goes out of business or decides to change their business and go somewhere else, you may then find that you basically can’t support a project that you have been using in a potentially mission critical environment.

But anyway, like I said, there’s an awful lot going on here, a lot of projects. If you’re interested in a particular topic, where do you start? Which one? Even ignoring the 27 million projects that are on GitHub, if you went to the Apache Software Foundation and you looked up logging, you’ll find multiple logging frameworks. Which one’s the best one for you? Which one is under active development? Which one has been put into an enterprise environment and really been challenged?

So you can understand why “where do I start?” is a concern to one degree or another. But, you know, really, this is where companies like Google, Red Hat and others, collaborating on open source, can help to a degree, and also the projects themselves, as I’ll come to later on.

So, other challenges for open source in the enterprise. The main category of this is perception versus reality, and I think a lot of it is more FUD and perception. Is closed source more dependable than open source? Should open source in fact have more of a challenge to convince people to use it, and develop with it, than closed source? Challenges that come up again and again are things like the design and implementation. As I said earlier, you know, there’s this FUD that open source developers aren’t as competent, or don’t understand the use cases as well, as closed source developers who’ve been doing it for years. So their skills and understanding of the principles in open source are way weaker, so you’re going to get something that you can play with but would never take to production. Then testing things that are in open source: how do you go about that? One of the things that closed source companies have done for years, and usually done very, very well, is test their software before they actually release it to customers: battle test it really hard and make sure that it can build and that it can run at scale. How do you do that in an open source project where maybe the developers are just typing away on their own personal laptops?

“It builds on my laptop” is not the thing that you want to hear if you’re about to put a project into production. You say, well, how do I know it works? “Well, it builds.”

So, testing and scale: how do you duplicate the environments in which developers build and deploy their applications? Then, reliability and security.

As I said, there is this FUD, much less these days because of all that stuff that I showed you earlier, that closed source is more reliable. A closed source company would say: we are more reliable because we have reliability through experts; our team understands security better than anybody else can, therefore what we build has got to be far more secure than anything that’s going to happen in open source.

However, the counter argument from open source projects and developers, and companies like Red Hat and Google and many others, is that an open source project actually has many more eyes on it, typically, than closed source projects. So we have peer-reviewed software, which leads to more reliability. And we can do stress testing, which I’ll come to later on. Closed source often, if you think about it, can do security through obscurity. They may not tell you that there is a bug in the software, and how can you find out until perhaps it’s too late? At least if there is a bug in some open source software and somebody finds it, typically, unless they’re going to exploit it, which does happen, they will make it known. They will say on a forum or mailing list or something: I have found a hole in the software, can you verify it? And somebody else in the community will look at it, maybe more than one person, and they will say, you’re right. And more times than not they’ll also have a fix for it, and it’ll be fixed within hours, usually.

So open source enables anyone to examine software for security flaws. Like I say, it does potentially open up the ability for people to examine the software and find holes that they could then exploit. But going back to what I said a minute ago, often that means that somebody else is also looking at the code and will probably find that hole, albeit maybe not at the same time, maybe very soon afterwards, and fix it.

So, just so you know that I’m not giving you words without any backing, I’ve got a couple of slides to at least try and convince you a little bit more.

In 1988, the very first internet worm was sent out, over the sendmail daemon and finger, the thing that would tell you whether people were online.

They were all open source. However, the very fact that they were open source meant that, once that vulnerability was identified and it was known that there was a worm out there, people looked at the code, fixed the code and got updates out in a matter of days.

There was a report by an organization in 2001. They did a report comparing Apache HTTP Server with Microsoft’s IIS, Microsoft’s equivalent, which is still around today. And they found it was much more reliable than Microsoft’s, and they said in the report: open source products have access to extensive expertise, and this enables the software to achieve a high level of efficiency.

Software defects were at similar levels between Apache and IIS. However, eWeek Labs did another survey, which said that open source organizations in general respond to problems more quickly and openly. And the reason for this is that if you want your open source project to survive, you want your community to trust you, and you want that community to feel empowered. And if somebody reports a bug, like, you know, a way of doing an internet worm with sendmail, and you completely ignore them, they’ll probably go off somewhere else and either create a competitive project to yours or find an already competitive project and get involved with that. So you really want to make your project successful by enabling your community.

There was another company called Coverity, who did four years of research on the Linux kernel. And they found fewer software bugs in it than the industry average for other operating systems. The URL there, the tiny URL, is actually a link to the research paper that came out of that, and you’ll find all this information there. But, you know, they found 985 defects in the 5.7 million lines of code that Linux had at that time. A typical program of similar size would have more than 5,000 defects. And they attributed this to there being more eyes on the problem.
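To put those figures on a common scale, here is a minimal sketch in Python that turns them into defects per thousand lines of code. The function name `defects_per_kloc` is made up for this illustration; the only inputs are the numbers quoted in the talk (985 defects, 5.7 million lines, and "more than 5,000" for a typical program of similar size, treated here as a lower bound).

```python
def defects_per_kloc(defects: int, lines_of_code: int) -> float:
    """Defect density: defects per thousand lines of code (KLOC)."""
    return defects / (lines_of_code / 1000)


# Figures as quoted in the talk.
LINES_OF_CODE = 5_700_000
linux_density = defects_per_kloc(985, LINES_OF_CODE)
# "More than 5,000" is treated as a lower bound for a typical program.
typical_density = defects_per_kloc(5000, LINES_OF_CODE)

print(f"Linux kernel:    {linux_density:.3f} defects per KLOC")
print(f"Typical program: at least {typical_density:.3f} defects per KLOC")
print(f"Ratio:           at least {typical_density / linux_density:.1f}x")
```

On these numbers the kernel comes out at roughly a fifth of the defect density quoted for a typical program of the same size.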

And another study found MySQL had fewer defects than 200 proprietary programs as well. So, you know, this quote here: in the open source model, code is written with more care and creativity, because developers are working only on things for which they have a real passion. And again, going back to what I said earlier, they want their projects to be successful. So if you find a problem with it, whether it’s a bug or some feature that doesn’t quite work the way you think it should, then it’s in their best interest, and typically they’ll want to do it anyway, to listen to you and fix it.

However, and this is a challenge that still faces every open source project today. You can leave till the end.

Few projects are product ready by default.

So I mentioned this a minute ago, about how people might say, and hopefully they don’t say it very often, that it builds on my laptop, so therefore you should be able to put it into production.

Often, that is what’s happening with upstream projects. People are writing some code, maybe they’re writing unit tests, and then they’ll push it to GitHub, for instance. That does not mean that that project is ready for prime time. Not everything in an upstream project should go into a product now, or should ever go into a product. Some things might take a while to mature.

We tend to use the term “bake” in the community, where, you know, whether it’s a feature or whether it’s something you think is a fix for a bug, you should give your community, and definitely your internal teams if you have them, a chance to test and make sure these things are right before you decide to put them into production.

Some things may be too immature. Some things may be features that you never, never want to support.

Because, again, thinking about it, with upstream projects, if they’re successful, you’ve got hundreds if not thousands of people hammering on the code from lots of different organizations. Despite the fact that there might be a very, very clear direction for the project as a whole, they may be adding things that aren’t exactly in line with what you as the user want. So if you’re coming to take a project and you want to put that into production, you might find that there are things in it that you have no interest in whatsoever, and you want to get rid of them, because you don’t want them to potentially destabilize the project when it’s in production. So sanitization of projects is pretty important: removing things that won’t ever be ready, or removing things that are just not quite there yet.

So this is an example of how we try to think about it in Red Hat. We have more products than this, but I’ve just put up the two most popular at this stage: Red Hat Enterprise Linux, which is our productized version of Linux, and JBoss application server, our enterprise middleware.

Both of those products are based on thousands and, in some cases, if you follow the transitive dependencies, hundreds of thousands of different projects. Some of these projects are really, really small. If you look at the Linux kernel, for instance, there are a lot of projects in the Linux kernel, and you’ve got lots of other projects that layer on top of Linux. They may do very small things, but they’re crucial things and you still depend on them. So we have to take all of these projects and effectively productize them into these two products, in a way that we can stand behind, that we can support, and that we know how to patch. And that might mean we have to sanitize here and there.

We decide there are some things, and they could be things our own engineers are working on, that are just not ready, or maybe things in the community that we’ll never want to support. So that in itself is a challenge. Not every project goes through this, and there are a number of companies out there that unknowingly look at open source, look at the successes that are going on in other areas, and say: well, I found this open source project on GitHub, or even on Apache or Eclipse, it must be good to use, I’m just going to use it for production. And they don’t actually understand that perhaps they should have understood the code a little bit more and looked at some things in there that are not quite ready for them yet.

So I’ve already hinted at this before, but, you know, one of the key things that goes into turning a project into a product, and it is part of the FUD that we used to hear from the likes of Microsoft, is around testing, and stress testing, and testing in environments that match those that you’re going to deploy into.

If you can only test on your laptop, there is absolutely no way that you can ensure that something that works on the laptop is going to work in an air traffic control system. And if you don’t even think about that, well, I definitely don’t want to be flying on a plane that goes anywhere near an air traffic controller who is relying on software that (a) I wrote and (b) I tested on my laptop. One of the things you want to understand is that unit tests are not the same as QA testing. So if you’ve written unit tests, and hopefully you’ll know this if you’ve run into people who do QA or QE (quality engineering or quality assurance, they’re kind of the same thing), you’ll know that one is: I’m just going to test this feature that I put in, and make sure it conforms to, you know, the signature that comes from a specification document. The other is: I’m going to test this thing to kingdom come. I am going to load test this. I’m going to throw 100,000 users at this thing in five minutes and see what happens. That is very hard for your average developer to do when they’ve only got their laptop, or they’re sat in the bedroom doing this in their spare time. So this tends to be the domain of companies like Red Hat, who can bring hardware and people to the problem, to look at these projects and say: that’s a really good project.

But we know it’s probably not ready for prime time as we download it from the Eclipse Foundation or GitHub. So we’re going to really put it through the wringer. We’re going to start throwing 100,000 connections at it. We’re going to start pushing it to the limit. And as we do that, maybe we’ll find problems. And if we do, we will find the solutions to those problems, and we’ll make those solutions available upstream, to the project that we got it from.
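The difference between the two kinds of testing can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `handle_request` is a hypothetical stand-in for whatever feature is under test, and a thread pool on one machine is nowhere near the real hardware a QE team would bring, but the shape of the two tests is the point.

```python
import concurrent.futures
import time
from typing import Tuple


def handle_request(user_id: int) -> str:
    """Hypothetical stand-in for the feature under test."""
    return f"hello user {user_id}"


def unit_test() -> None:
    # Unit test: one call, checked against the specified behaviour.
    assert handle_request(42) == "hello user 42"


def load_test(users: int, workers: int = 32) -> Tuple[int, float]:
    """Fire `users` requests through a thread pool.

    Returns (number of failed requests, elapsed seconds).
    """
    def one_call(uid: int) -> bool:
        try:
            return handle_request(uid) == f"hello user {uid}"
        except Exception:
            return False

    start = time.perf_counter()
    with concurrent.futures.ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(one_call, range(users)))
    elapsed = time.perf_counter() - start
    return results.count(False), elapsed


unit_test()
failures, elapsed = load_test(users=10_000)
print(f"{failures} failures out of 10,000 simulated users in {elapsed:.2f}s")
```

The unit test says the feature matches its specification once; the load test asks whether it keeps doing so when many callers arrive together, which is where real hardware and dedicated people come in.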

And performance testing is part of this as well: making sure that it is something, as I said, that will scale. If you need it to scale linearly as you throw more machines at it, then we’ll test that that’s actually the case. And open source projects that go into the enterprise all need to go through this, candidly. It doesn’t matter whether it’s done through Red Hat or by yourself. You need to check that this is really ready for an enterprise environment.
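As a rough illustration of what checking for linear scaling might look like, the Python sketch below compares measured throughput against the ideal straight line extrapolated from the smallest cluster. The function names, the 10% tolerance and the throughput figures are all invented for the example; real numbers would come from a performance test run.

```python
from typing import Dict


def scaling_efficiency(throughput: Dict[int, float]) -> Dict[int, float]:
    """Map machine count to measured throughput / ideal linear throughput.

    The ideal line is extrapolated from the smallest machine count measured.
    """
    base_machines = min(throughput)
    per_machine = throughput[base_machines] / base_machines
    return {n: t / (per_machine * n) for n, t in sorted(throughput.items())}


def scales_linearly(throughput: Dict[int, float], tolerance: float = 0.10) -> bool:
    """True if every measurement is within `tolerance` of the ideal line."""
    return all(e >= 1.0 - tolerance
               for e in scaling_efficiency(throughput).values())


# Invented measurements: machines -> requests per second.
measured = {1: 1000.0, 2: 1950.0, 4: 3800.0, 8: 6000.0}
efficiency = scaling_efficiency(measured)

for machines, eff in efficiency.items():
    print(f"{machines} machine(s): {eff:.0%} of linear")
print("Scales linearly:", scales_linearly(measured))
```

With these made-up figures the system tracks the ideal line up to four machines and then falls away at eight, which is exactly the kind of thing you want to find out before production rather than after.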

So hopefully it is fairly obvious why we need to do these things, because I’ve mentioned it a little bit, but when you deploy to an enterprise you’re not deploying to someplace that is going to be running for only four weeks, let’s say, or for a year.

We’re talking about air traffic control systems. We’re talking about banking, finance, hospitals, the nuclear power industry, for instance. We’re talking about places that put software into production that they expect to be there for decades. So banking systems, you know: we work a lot in Boston and the City of London, and they often look at two decades. They need the software to be reliable and they want to make sure it’s there for the next 20 years. So taking something from an arbitrary project, and not doing that kind of due diligence and then ensuring that you can support it, is really risky for these environments. Because, you know, if you don’t understand the code, and you don’t understand who’s actually hacking on the code and contributing to the code, you could suddenly find that they have gone somewhere else and you are left holding code that you don’t understand and can’t patch, and a bank or, worst case scenario, a nuclear power station needs you to patch it quickly.

So you have to account for it. And again, this is something that closed source enterprise software takes into account; they have the processes. I used to work at HP, and colleagues in the room have worked for Microsoft and other closed source companies. This was what they built themselves up on: being able to say to a bank, yes, we will be here in 20 years’ time, and yes, we will be able to keep patching that for as long as you need it.

So Red Hat and other companies now do that, because open source has matured so much. You can now bet on open source projects being around, in a way that you can support them, or that we can support them, for 20 or 30 years. You do not have to think: if I’m looking for something to be around for two decades, do I really have to go to a closed source company to actually get that?

So I did want to say that open source really lives or dies based on the success of its communities, and how vibrant, you know, they can be. Red Hat wouldn’t have survived if we had decided to get involved with communities that nobody was interested in. You know, we wouldn’t have had a business model, and Red Hat would never have grown to the company it is now.

So, you know, if you’re really interested in getting involved in open source, like I said at the start, if this turns you on, again, that site has some really interesting articles on how to get involved: you know, how to create a project if you find that there’s a need and there is no project out there, and how to get involved in a project that might already be there. And getting involved can include things like just using it and giving feedback to developers, or providing updates to documentation. You do not have to be a coder to get involved in an open source project. You could just raise bugs and give people enough information to know that this thing does not work, and this is the environment in which it doesn’t work, and then ideally say, this is what I did to fix it. But even if you don’t, knowing the bug is there is better than not knowing. You would have contributed, in no small measure, to the success of that open source project.
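As a sketch of what “enough information” might look like in practice, the Python snippet below assembles a plain-text bug report that records the environment alongside the steps to reproduce. The `bug_report` helper and the example project details are hypothetical; individual projects usually have their own issue templates, which should take priority.

```python
import platform
import sys
from typing import List


def bug_report(summary: str, steps: List[str], expected: str, actual: str) -> str:
    """Build a plain-text bug report that includes the local environment."""
    lines = [
        f"Summary: {summary}",
        "Environment:",
        f"  OS: {platform.system()} {platform.release()}",
        f"  Machine: {platform.machine()}",
        f"  Python: {sys.version.split()[0]}",
        "Steps to reproduce:",
    ]
    lines += [f"  {i}. {step}" for i, step in enumerate(steps, start=1)]
    lines += [f"Expected: {expected}", f"Actual: {actual}"]
    return "\n".join(lines)


# Hypothetical example values, purely for illustration.
report = bug_report(
    summary="Parser rejects empty configuration file",
    steps=["Create an empty file named app.conf", "Run the tool against it"],
    expected="Tool starts with default settings",
    actual="Tool exits with a parse error",
)
print(report)
```

None of this requires being a coder on the project itself; the value is in telling the developers precisely what failed and where.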

So, before we get to some questions, I did want to finish with some other examples of where open source is used. Some of them are in the enterprise, and some of them are just to give you a flavor of the benefits and reach of open source. You may know some of these, and maybe you don’t.

So there’s this little video streaming company that used to sell videos through the post and then decided that that was not going to go very far called Netflix.

They did not start out open source, but eventually they decided they had built up an awful lot of infrastructure for doing video streaming on Amazon Web Services, and they realized that an awful lot of what they had built could actually be of use to the wider software industry. So they started to open source an awful lot of projects. Some of the testing projects that would simulate network crashes and machine crashes, they open sourced those. Netflix OSS: if you go and Google that, you’ll find this page and you’ll find links to a lot of their projects, and they continue to open source. The Mars explorer.

This was the cover when this came out, the Mars explorer; this article came out about four or five weeks ago.

It uses a very old version of Java, which is open source: the Java that was around at the time that this was sent up.

So there is an open source programming language running on at least one of the Mars explorers. And interestingly, they use an open source messaging implementation at the point where the data comes from the rover and goes up to a satellite, which is, you know, in orbit around Mars, and then that sends it back to the Earth.

It’s actually ActiveMQ, or it was. The Large Hadron Collider: let’s face it, you know, if they are prepared to put open source underneath the Large Hadron Collider, then they’ve got to know what they’re about, because if that goes wrong, well, if you believe the science fiction, the fake news, we could have a black hole in the center of the Earth.

But they use quite a bit of open source: Linux, for instance, and also, I don’t know if they still do, but they used to use a variant of our messaging service from Red Hat.

Open source in education. So open source, I believe, removes artificial barriers between teachers and students. So, one of the problems: I have a couple of sons, and they’ve kind of grown up in the open source age. One of them was just before it, and then one of them was really into it. And my eldest had real problems in being able to duplicate at home what he was doing in school, because the teachers had Microsoft licenses that he couldn’t get access to, or that I would not have the money to get access to. By the time my youngest got into school, they had moved to open source equivalents and things like Scratch, for instance. I don’t know if anybody’s looked at Scratch. It’s a programming language, a very, very basic programming language, but it’s there to demonstrate for loops and things like that. So teachers use that, and it’s available for kids to use, and it runs on any operating system that you want to use, and so kids at home can do homework and can duplicate what they’ve been doing in school. And then there’s another thing down there called Devoxx4Kids.

This is a GitHub organization, and you could Google that again, but it’s on the back of a very popular series of developer conferences called Devoxx.

The organizers and various vendors and members of that community decided that it’d be really good if they could get involved with kids from, like, six or seven years up and teach them how to do various things, whether it’s programming in Minecraft or Scratch, or doing, you know, hardware solutions, but everything has to be open source. So, if you’re at all interested in getting involved with kids, Devoxx4Kids is a great place to start.

So this one: Raspberry Pis. And BeagleBoards are kind of an American version, I think, of the Raspberry Pi. But there were two things that made it successful. One, it’s very, very cheap. And two, it was based on open source software; you know, it has Linux on it. So it would not have been successful if they’d had to get a license for a closed source operating system from some closed source operating system vendor based in Seattle, perhaps. Massively successful. It has been, like, four or five years, and they’ve just brought out the Raspberry Pi with a 64-bit ARM chip, and kids are using these today. They’re able to spend, like, 30 pounds, it comes with a free operating system, and they’ve got a computer that they can actually use and program and add peripherals to.

So I mentioned this one, Minecraft. I don’t know if anybody has actually played with Minecraft or has kids who play Minecraft. Minecraft is Java based. Microsoft bought Minecraft for a billion dollars four years ago.

And as I said, it’s all Java based, with a huge user community and a huge development community. So one of the things that makes Minecraft extremely successful is that they have this notion of modding. So, because it’s all open source and because it’s based on Java and the Java programming language, you can get in there and you can change the behaviors of things. So you can have a gun that, when you fire it, bubbles come out; you can change the blast radius of TNT, for instance. And my son learned how to program in Java by using Minecraft. He would never have learned how to program in Java if I had asked him to; he did it to build mods on Minecraft and share those with his friends. So he taught himself Java, and he taught himself how to unzip Minecraft, add in mods and send them around.

So, this is before Microsoft. Still going strong.

So, in conclusion, I think, you know, despite the fact that we had a lot of FUD in the early 2000s, and we still struggle a little bit today, it is fair to say that open source is now mainstream. And if you look back at a number of the things I mentioned, like big data and mobile and these things, it has been at the heart of all of the significant waves in the software industry and the hardware industry for the last 20 years, and hopefully you’re maybe a little bit surprised about where it is being used.

It is typical to see open source driving development and to have people do open source first.

There are a number of organizations that, these days, will only accept software from vendors if it is open source. The European Commission, a number of years ago, changed the rules for submissions and collaborations to have a very strong open source remit within them. So if you wanted to get together with academics and industry to put a bid in for EU funding, you had to be able to show how this was going to be using open source.

And I think, like I said, it brings a lot of benefits in terms of code, collaboration, community, etc. And there’s that link again to

So that’s all I have to say. I think we have some time for some questions, if anybody has any. I have a microphone.

I work with electric vehicles, but Automotive Grade Linux, maybe you’ve heard of it. It’s an open source platform used by some of the major car companies of the world, following the connected Jeep. So that’s also a great example, where you wouldn’t think they would trust open source, but they’ve collaborated to build the common pipework and then add maybe 20% of their own effort to make it their own. Like Toyota versus Suzuki, for example.

That’s a good question

More questions? So, Minecraft is open source, but it was worth a billion dollars to Microsoft. So what’s in this for everyone? So, I mean, what I heard was that Microsoft acquired Minecraft for the developer community.

So, the company behind Minecraft at the time, I can’t remember the name, their revenue model was basically that they sold a license. You could build it yourself, but you should get the licensed version from them and use that.