Over the last decade or so, the number of mobile applications has exploded, with the total number of apps available on the App Store reaching 2.2 million in January 2017. The mobile experience reached the next level with the launch of the first iPhone in 2007. Since then, many companies have built excellent smartphones that let users run these applications seamlessly. A majority of these mobile applications are B2C in nature, i.e., they target consumers and not businesses. These consumer-facing apps share one distinctive characteristic: SCALE.
Most successful apps grow from a few thousand users to more than a million in no time. What does this scale really mean for the startups that make these apps? Millions of requests per minute hitting their backend servers, and terabytes of data that have to be collected and analysed to understand user behaviour. Creating a technical infrastructure to sustain and serve this scale is not an easy task. Yet new startups enter the market regularly, offering innovative applications and scaling up to handle this traffic and data. With limited resources, both in capital and in people, how are these companies able to do this? The answer lies in the prominent role that open-source software now plays in the industry.
Open-source software is not a new concept. It dates back to 1983, when Richard Stallman announced the plan for the GNU operating system; the first version of Linux was developed using GNU’s development tools in 1991. Open-source software has seen significant growth over the years, as shown by the increase in the number of projects under the Apache Software Foundation, along with their committers and lines of code. Not only has the number of projects increased, but their quality and stability have improved significantly as well. Gone are the days when most of these projects were driven by individuals in remote locations coming together to work on an initiative.
The companies operating at the largest scale in the internet era include Google, Facebook and Netflix, each with a huge user base that grows through massive real-time engagement with its services. To serve their users, these companies employ an army of engineers constantly working to create the required technology. Most of the systems built by these engineers are generic in nature: they solve problems faced by a majority of businesses in this space and are not tightly coupled with the company’s core business model. This allows such companies to make the systems they create available to the rest of the startup community as open-source software. In fact, with all the tech giants open-sourcing their work, it has also become a talent acquisition and retention strategy that appeals to the engineers they want to hire.
Open-source software today comes in various flavours. The first is the raw form, where you get the vanilla version directly from GitHub, build and deploy it in your environment, and start using it. This approach means getting involved with the technology at a very deep level and handling most issues either in-house or through community support.
If you want more support in the open-source world, there is another option: software that is open source at its core but is packaged and sold as an enterprise offering by companies that enhance it and add support on top of it. These enhancements could be operational in nature, such as deployment and monitoring; they could be tight integration with other technologies; or they could simply be features that are missing from the core but very helpful for a lot of potential adopters.
Another variant of open-source software is when it is packaged as a managed service by cloud service providers such as Amazon and Google. With most startups these days hosted completely in public clouds like AWS (Amazon Web Services) or GCP (Google Cloud Platform), these managed services are extremely useful. If you want to use Redis in AWS, you have Amazon ElastiCache at your disposal. If you want to use Apache Hadoop YARN, you have the option of Amazon EMR. These managed services mostly come with a pay-as-you-go model and a support team of engineers who are constantly working on enhancing them.
With this ecosystem in place and a plethora of options available, new startups and their lean teams can focus on the core business requirements and deliver features to their customers at a much faster rate. They pick these underlying systems and plug them together to form the core of their technical infrastructure.
There’s Apache Kafka, a distributed streaming platform capable of handling the billions of events that your app may be sending; Apache Cassandra, to store terabytes of your data in a flexible schema format with very fast read/write rates; Apache Spark, to serve as an analytics engine for your data processing; and Apache Hive, a data warehouse that can scale to petabytes of data and give you a very long retention window. If you want caching solutions, you have Redis, which is an open source (BSD licensed) in-memory data structure store, or Apache Ignite, which describes itself as “a memory-centric distributed database, caching, and processing platform for transactional, analytical, and streaming workloads, delivering in-memory speeds at petabyte scale”.
As the CTO of an exponentially growing startup, you want your engineers to look at the business problem and provide a software architecture that is an assembly of components, put together by identifying patterns that map to problems that have already been solved. It then becomes an exercise of picking from the abundant choice of open-source software to fill those components in the architecture. One thing to note is that using open-source technology of any kind needs more involvement from the engineers than simply using proprietary technology. However, with different business models now fuelling this software, the risks of using these technologies and the resources required to adopt them are going down significantly.
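To make the idea of assembling an architecture from interchangeable components a little more concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the names `EventStream`, `Cache`, `InMemoryStream`, `InMemoryCache` and `ProfileService` are hypothetical, and the in-memory classes are stand-ins for the slots where a team might place something like Kafka or Redis. No real client library is used or implied.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Optional, Protocol


class EventStream(Protocol):
    """Hypothetical slot for an event streaming component."""
    def publish(self, topic: str, event: dict) -> None: ...


class Cache(Protocol):
    """Hypothetical slot for a caching component."""
    def get(self, key: str) -> Optional[str]: ...
    def set(self, key: str, value: str) -> None: ...


class InMemoryStream:
    """Stand-in that keeps events in a per-topic list."""
    def __init__(self) -> None:
        self.topics: Dict[str, List[dict]] = defaultdict(list)

    def publish(self, topic: str, event: dict) -> None:
        self.topics[topic].append(event)


class InMemoryCache:
    """Stand-in that keeps values in a plain dictionary."""
    def __init__(self) -> None:
        self.data: Dict[str, str] = {}

    def get(self, key: str) -> Optional[str]:
        return self.data.get(key)

    def set(self, key: str, value: str) -> None:
        self.data[key] = value


class ProfileService:
    """Business logic that depends only on the two slots above."""
    def __init__(self, stream: EventStream, cache: Cache,
                 load_profile: Callable[[str], str]) -> None:
        self.stream = stream
        self.cache = cache
        self.load_profile = load_profile  # e.g. a database lookup

    def get_profile(self, user_id: str) -> str:
        key = f"profile:{user_id}"
        cached = self.cache.get(key)
        if cached is not None:
            return cached  # served from the cache, no database call
        profile = self.load_profile(user_id)
        self.cache.set(key, profile)
        # Record the cache miss as an event for later analysis.
        self.stream.publish("profile_loaded", {"user_id": user_id})
        return profile


# Usage: wire the stand-ins together and count database calls.
db_calls: List[str] = []

def fake_db_lookup(user_id: str) -> str:
    db_calls.append(user_id)
    return f"profile-of-{user_id}"

stream = InMemoryStream()
cache = InMemoryCache()
service = ProfileService(stream, cache, fake_db_lookup)
first = service.get_profile("u1")
second = service.get_profile("u1")
```

Because `ProfileService` only knows about the two small interfaces, the stand-ins could in principle be replaced by thin wrappers around whichever open-source or managed component the team picks, without touching the business logic.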
The author is a CTO at Dream11.