Eating right at the open source buffet
February 01, 2013
With the smorgasbord of Open Source Software (OSS) available for developers to dine from, it's vital they "eat right" by choosing the OSS compatible w...
Open Source Software (OSS) offers intelligent systems designers a veritable smorgasbord of tools and technology. Spanning the entire software stack, from boot code and drivers to OSs, executives to middleware, and application components to development tools, OSS provides readily available alternatives both to legacy commercial software and to in-house code developed from scratch.
But dining at the open source table is not an embedded bean feast – code gathered à la carte might not always integrate easily to make a well-formed “meal.” While literally millions of OSS projects are available on popular forges and hubs, developers must take care to choose the right technology ingredients and tidbits to fit project and intellectual property needs.
The following examines resources and tools for discovering OSS projects and metrics for those projects. It also explores factors to consider when choosing OSS projects and components for embedded designs. And it serves up heuristic methods for choosing the OSS technology most appropriate to real-world embedded development needs.
Discovering embedded open source
Finding open source software is easy. Finding the right piece of OSS can be much harder. Luckily, options for finding and evaluating OSS are plentiful, and come in five categories: search engines, hosting sites, individual project sites, dedicated OSS discovery tools, and embedded platform distributions.
Search engines
Google, Yahoo!, Bing, Baidu, and other general-purpose search engines actually do an okay job at ferreting out OSS projects. A quick search on the string “open source embedded database,” for example, yields a rich mix of references and actual project sites and repositories. But while search engines are an okay starting point, using them can yield scattershot results.
Hosting sites, foundations
Another path is to go right to the source – the forges and hubs that host multiple projects. Until a few years ago, SourceForge would have been a developer’s prime destination, with its collection of 450,000 project repositories. But today, new projects are likely to find homes on GitHub (with 2.4M unique repositories), CodePlex (32,000 projects), Google Code (10,000 projects), Gitorious, and a long tail of other sites.
Yet another type of locale for project hosting is the gamut of open source foundation forges – the Apache Foundation, the Outercurve Foundation, the Eclipse Foundation, and others. These sites bring together usually related bodies of code (for example, IDE elements and plug-ins for Eclipse) and can boast several hundred hosted projects.
While repository aggregations and foundation sites are searchable by themselves, each still constitutes a distinct silo; however vast their portfolios may be, they don’t cover the entire universe of open source.
Project sites
Some projects eschew the crowded forges and build their own dedicated Web sites and repositories. These may be projects of broad community interest, of greater maturity, or merely the result of technical vanity. In any case, the main challenge is still finding the project, not in the relatively limited haystack of a forge but in the larger universe of the World Wide Web.
Discovery portals and tools – The Michelin Guide of OSS
Probably the shortest path to finding and also evaluating open source projects lies in portals that help developers discover, track, and compare open source code and the projects behind them. These free portals include Ohloh.net (owned by Olliance Consulting parent company Black Duck), Google Code Search, and others. These services track the full gamut of open source software, and like the projects they monitor, they are themselves open, letting users introduce new project repositories for cataloging and analysis.
OSS management platform tools also exist to help developers discover suitable open source, whether homegrown or found “in the wild.” At companies with established policies for OSS use and deployment, developers can use these tools to peruse directories of vetted/approved open source code documented and/or maintained by their employers. These portfolios can also include code built and managed under the umbrella of “inner-source” and “corporate source” programs.
Embedded platform distributions – Prix fixe meals
If the organization has already committed to a prepackaged embedded platform distribution – a commercial or community-based Linux tool kit, an Android SDK, or equivalent – then engineers already have a library of applications, middleware, and utilities at their fingertips. Embedded distributions typically comprise 250 to 500 packages, with each package containing one or more unique, ready-to-use pieces of project code. Unlike downloading code directly from project sites, embedded distributions and SDKs usually include prebuilt versions of project code, tested and vetted for integration compatibility across packages. In many cases, these versions might not be the latest and greatest, and developers might need to turn to the original project sites to access the more current features and bug fixes. However, switching to newer versions of projects, while attractive, can break compatibility with other code in your stack, and also fall outside Service-Level Agreements (SLAs) from commercial suppliers.
Evaluating options, refining the OSS palate
Finding potentially useful code represents only half the challenge. Developers must also vet discovered code across a variety of parameters to determine if it is technically and legally viable. Factors to consider include code size, language, and quality; community history and dynamics; software licensing; and provenance.
Code size – Legacy embedded designs faced severe constraints on code size. While tumbling DRAM and flash memory prices have made parsimonious provisioning a concern of the past, embedded software still benefits from compact code. Memory and storage eaten up by utility and infrastructure code are unavailable for differentiating software and for end-user content.
Because OSS starts with source code, the memory footprint of a given project or software component isn’t always obvious. Moreover, today’s device-based software stacks can contain ingredients cooked up in traditional compiled/assembled languages (C, C++, assembly), byte-code executed Java, and scripted/interpreted languages (PHP, Python, Lua, and so on).
The sites and tools mentioned earlier report both the language of projects and the Lines of Code (LoC) in each. If a project is truly size-sensitive, the best approach is to download and build the source code to determine actual binary size (or just examine the total size of scripted/interpreted code). Figure 1 uses Ohloh reports to compare source code growth in three database projects over time.
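For a quick first look before building anything, a rough per-language line count can be taken from a downloaded source tree. The following is a minimal sketch, not a substitute for the portals' reports or for measuring the built binary: the extension-to-language map and the function name count_lines_by_language are assumptions made for illustration, and the count ignores comments versus code.

```python
import os
import sys
from collections import Counter

# Hypothetical extension-to-language map; extend it to suit your stack.
EXTENSIONS = {
    ".c": "C", ".h": "C", ".cpp": "C++", ".s": "Assembly",
    ".java": "Java", ".py": "Python", ".lua": "Lua", ".php": "PHP",
}

def count_lines_by_language(root):
    """Walk a source tree and tally non-blank lines per language."""
    totals = Counter()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            extension = os.path.splitext(name)[1].lower()
            language = EXTENSIONS.get(extension)
            if language is None:
                continue  # skip files the map does not recognize
            path = os.path.join(dirpath, name)
            with open(path, encoding="utf-8", errors="replace") as handle:
                totals[language] += sum(1 for line in handle if line.strip())
    return totals

if __name__ == "__main__":
    tree = sys.argv[1] if len(sys.argv) > 1 else "."
    for language, lines in count_lines_by_language(tree).most_common():
        print(f"{language:10s} {lines}")
```

Run against a checked-out project directory, this prints one line per language found, largest first. Line counts only hint at footprint; for size-sensitive designs the built binary remains the measure that matters.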
Language – Implementation language is as important as functionality and size. If a project is being developed in C, projects in Java or Python probably won’t integrate well into the existing or planned software stack.
Code quality – Code quality can prove rather difficult to gauge. OSS discovery portals do report how well commented/documented OSS projects are. Other tools exist to vet the quality of code contained within a project, for example, open source Sonar and the popular Coverity suite.
Community dynamics – Important metrics of the health and quality of open source projects lie in the size and activity of the communities behind them. Some hosting sites offer historical participation metrics, and some sites include contributor data and activity over the lifetime of a project. Figure 2 uses Ohloh reports to compare the waxing and waning of the developer community over time for three database projects.
Commit history – Tied to community dynamics is the commit history for a project – how often are changes committed to project repositories, over the project lifetime and for recent timeframes? In an immature project, change can appear to be fast and furious; for moribund projects, commits drop away to zero. Viable, stable, mature projects lie somewhere in between.
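The rule of thumb above can be turned into a rough screening function. This is a sketch under stated assumptions: the input is a list of monthly commit counts (oldest first) that you have already gathered from a repository or portal, and every threshold (recent_months, moribund_max, young_months, busy_min) is an illustrative placeholder rather than an established norm.

```python
from statistics import mean

def classify_commit_activity(monthly_commits, recent_months=6,
                             moribund_max=1.0, young_months=18,
                             busy_min=200.0):
    """Label a project from monthly commit counts, oldest month first.

    All thresholds are hypothetical defaults chosen for illustration.
    """
    if not monthly_commits:
        return "no history"
    # Average commits per month over the most recent window.
    recent = mean(monthly_commits[-recent_months:])
    if recent <= moribund_max:
        return "possibly moribund"
    # Short history combined with heavy churn suggests an immature project.
    if len(monthly_commits) < young_months and recent >= busy_min:
        return "possibly immature"
    return "steady"

if __name__ == "__main__":
    print(classify_commit_activity([40, 35, 30, 10, 2, 0, 0, 0, 0, 0, 0]))
    print(classify_commit_activity([300, 420, 380, 510]))
    print(classify_commit_activity([60] * 36))
```

The labels are deliberately tentative. A quiet repository may simply be finished, and a busy young one may be well run, so treat the output as a prompt for a closer look rather than a verdict.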
Licensing – Dealing with the diversity of open source license types and requirements is beyond the scope of this article. Of the 2,200+ recognized licenses, developers are most likely to encounter perhaps a dozen. (See a list of the top 20 open source licenses at http://osrc.blackducksoftware.com/data/licenses/; these account for 90 percent of all projects.) The most important open source licenses are the GNU General Public Licenses (GPL, LGPL, AGPL), the Apache License (APL), the BSD license, the Mozilla Public License (MPL), the Eclipse Public License (EPL), and a handful of others. Learn more about these and others at the Open Source Initiative, OpenSource.org.
A larger challenge lies in reconciling project licensing with a company’s Intellectual Property Rights (IPR) governance and compliance programs. A related challenge is reconciling the requirements of different licenses for diverse code integrated into a single software stack.
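One small, mechanical piece of that reconciliation is checking each component's declared license against the company's own policy lists. The sketch below assumes such lists exist; the APPROVED and NEEDS_REVIEW sets, the component names, and the function audit_licenses are all hypothetical placeholders, and the license labels follow SPDX-style short identifiers purely as a naming convenience. It says nothing about whether any two licenses are actually compatible, which is a question for the compliance program.

```python
# Hypothetical policy lists; a real program would load these from the
# company's compliance tooling rather than hard-coding them.
APPROVED = {"Apache-2.0", "BSD-3-Clause", "MPL-2.0"}
NEEDS_REVIEW = {"GPL-2.0-only", "LGPL-2.1-only", "EPL-1.0"}

def audit_licenses(components):
    """Sort components into policy buckets by their declared license.

    `components` maps a component name to its declared license label.
    """
    report = {"approved": [], "needs review": [], "unknown": []}
    for name, license_id in sorted(components.items()):
        if license_id in APPROVED:
            report["approved"].append(name)
        elif license_id in NEEDS_REVIEW:
            report["needs review"].append(name)
        else:
            # Anything the policy does not list is flagged, never assumed safe.
            report["unknown"].append(name)
    return report

if __name__ == "__main__":
    # Illustrative stack; the names and licenses are made up for this example.
    stack = {
        "example-db": "Apache-2.0",
        "example-kernel-module": "GPL-2.0-only",
        "example-codec": "Custom-Vendor",
    }
    for bucket, names in audit_licenses(stack).items():
        print(f"{bucket}: {', '.join(names) or '-'}")
```

Defaulting unlisted licenses to "unknown" keeps the check conservative: a component reaches the approved bucket only when the policy names its license explicitly.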
Provenance – Knowing the actual origins of code can also help in finding support for code, as well as protecting the company from potential legal challenges. Many useful and important projects are associated with commercial organizations that help maintain the project and provide support for it. Most projects have a unified copyright (note: the Linux kernel does not), and many have established processes for determining provenance (for example, certificates of origin for code submission).
The choosy OSS diner
The goal here has been to serve code-hungry developers useful pointers for discovering, vetting, and ingesting open source software. The diversity of options and the surfeit of licenses need not require a particularly adventuresome technology palate – OSS is today truly mainstream and it is a rare embedded project that does not use and/or deploy open source software tools and components.
Matching the right OSS technology to your project is less like rocket science and more like pairing wines and food. More time “tasting” OSS will teach you where to look for compatible coding languages.
Olliance Consulting, a division of Black Duck Software www.ohloh.net