10th LASER Summer School on Software Engineering
Software for the Cloud and Big Data
September 8-14, 2013 - Elba Island, Italy
Roger Barga (Microsoft)
Speaker: Carlo Ghezzi, Politecnico di Milano
Software is increasingly embedded in unstable settings where changes occur continuously and at all levels. Changes may occur at the requirements level. They may occur in the environment, and thereby affect the domain assumptions upon which the software was developed. They may also affect the computational infrastructure within which the software runs. Changes may lead existing software into a situation where it fails to satisfy its intended goals. They may lead to failures or to unacceptable quality of service, and thus often to breaking the contract with the software's clients.
Software engineering has long studied the problem of off-line evolution (also known as software maintenance). Many applications, however, run continuously and require on-line change support while they are providing service: they require self-adaptive capabilities. To achieve this goal, a paradigm shift is needed, one that dissolves the traditional boundary between development time and run time. In particular, models must be kept at run time and verification must be performed to detect possible requirements violations. The lectures start by focusing on the real-world requirements that lead to self-adaptive systems and then discuss how reflective capabilities can be designed to support self-adaptation. They will discuss the issues involved in run-time verification (in the context of model checking) and in supporting safe dynamic software updates.
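The idea of keeping a model at run time and verifying requirements against it can be illustrated with a minimal sketch. The class and requirement below are hypothetical, not taken from the lectures: the "model" is a sliding window of observed latencies, the "requirement" is an average-latency bound, and a violation triggers an adaptation callback.

```python
# Minimal self-adaptation sketch (hypothetical example, not from the lectures):
# a monitor keeps a simple run-time model (a sliding window of latencies)
# and checks a requirement ("average latency below a threshold") on every
# observation, invoking an adaptation callback when the requirement fails.
from collections import deque

class RuntimeMonitor:
    def __init__(self, threshold_ms, window=5, on_violation=None):
        self.threshold_ms = threshold_ms
        self.samples = deque(maxlen=window)   # the run-time "model"
        self.on_violation = on_violation      # e.g. switch to a degraded mode

    def observe(self, latency_ms):
        self.samples.append(latency_ms)
        avg = sum(self.samples) / len(self.samples)
        ok = avg < self.threshold_ms
        if not ok and self.on_violation:
            self.on_violation(avg)            # requirement violated: adapt
        return ok

violations = []
monitor = RuntimeMonitor(200, window=3, on_violation=violations.append)
assert monitor.observe(100)        # average 100 ms: requirement holds
assert monitor.observe(150)        # average 125 ms: requirement holds
assert not monitor.observe(500)    # average 250 ms: violation detected
```

Real run-time verification checks far richer properties (e.g. via model checking over updated models), but the loop is the same: observe, update the model, verify, adapt.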
Speaker: Sebastian Burckhardt, Microsoft
Data replication is a common technique for programming distributed systems, and is often important to achieve performance or reliability goals. Unfortunately, the replication of data can compromise its consistency, and thereby break programs that are unaware. In particular, in weakly consistent systems, programmers must assume some responsibility to properly deal with queries that return stale data, and to avoid state corruption under conflicting updates. The fundamental tension between performance (favoring weak consistency) and correctness (favoring strong consistency) is a recurring theme when designing concurrent and distributed systems, and is both practically relevant and of theoretical interest. In this course, we investigate how to understand and formalize consistency guarantees, and how we can determine if a system implementation is correct with respect to such specifications. As a special case, we will visit some classical results of distributed systems, and learn about correctness conditions for concurrent objects and replicated data types.
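The tension between weak consistency and conflicting updates can be made concrete with a classic replicated data type. The sketch below (an illustration, not course material) shows a state-based grow-only counter: each replica increments only its own slot, and merging takes the element-wise maximum, so concurrent updates never conflict and all replicas converge to the same value.

```python
# A grow-only counter (G-Counter), a classic state-based replicated data
# type. Each replica increments only its own slot; merge takes the
# element-wise maximum, which is commutative, associative, and idempotent,
# so replicas converge regardless of the order in which state is exchanged.
class GCounter:
    def __init__(self, replica_id, n_replicas):
        self.id = replica_id
        self.counts = [0] * n_replicas

    def increment(self):
        self.counts[self.id] += 1         # update only our own slot

    def value(self):
        return sum(self.counts)           # total across all replicas

    def merge(self, other):
        self.counts = [max(a, b) for a, b in zip(self.counts, other.counts)]

a, b = GCounter(0, 2), GCounter(1, 2)
a.increment(); a.increment()   # two updates at replica 0
b.increment()                  # a concurrent update at replica 1
a.merge(b); b.merge(a)         # replicas exchange state, in either order
assert a.value() == b.value() == 3   # both converge to the same value
```

Note what the type does not give: a query before the merge may return stale data (replica b would report 1, not 3), which is exactly the responsibility that weak consistency shifts onto the programmer.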
Speaker: Karin Breitman, EMC
Most companies around the world have massive structured and unstructured business data about their projects, products, processes, production, and people. Today's challenge is how to transform all of that into valuable insight. Big Data is a recent buzzword for the collection of tools, methods, and techniques that can be employed to manipulate very large datasets. Despite the hype and inconsistencies around the term, there is a comprehensible set of computer science skills required of anyone who calls themselves a data scientist. In this lecture we dissect, define, and take a deep dive into some of the (unsurprisingly not new) required Big Data disciplines. We then illustrate them with real industry examples (no Social Networks, sorry) from Telco, Health, and Oil & Gas. We conclude by discussing job opportunities in industry and research.
Speaker: Adrian Cockcroft, Netflix
Starting in 2009, Netflix built out a set of architectural patterns focused on future-proofing the scalability, availability, and agility needs of the Netflix streaming video service as a "green field" application, optimized for running on the globally distributed public cloud supplied by AWS. As the architecture matured, parts of it were released as open source projects, and now in 2013 they form a complete platform as a service (PaaS) offering known as NetflixOSS. The platform is built using Java, Scala, Groovy, and Python. This cloud native architecture is notable for making every service (including storage) ephemeral; for its use of chaos engines that continuously disrupt services to promote antifragility; and for achieving high scalability and availability through fine-grained stateless micro-services, backed by a storage tier that is triple-replicated within a region and supports global replication across regions.
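The chaos-engine idea can be sketched in a few lines. The classes and names below are illustrative, not NetflixOSS APIs: periodically terminate a random instance of a stateless micro-service and rely on replicated instances to keep the service available, so weaknesses surface continuously rather than during a real outage.

```python
# A toy sketch of the chaos-engine idea (illustrative names, not the
# NetflixOSS APIs): kill a random instance of a stateless service group
# and check that redundancy keeps the service available.
import random

class ServiceGroup:
    def __init__(self, name, instances):
        self.name = name
        self.instances = set(instances)

    def kill_random(self, rng):
        victim = rng.choice(sorted(self.instances))  # the chaos engine strikes
        self.instances.discard(victim)
        return victim

    def is_available(self):
        # with fine-grained stateless instances, any survivor can serve
        return len(self.instances) > 0

rng = random.Random(42)                  # seeded for a repeatable run
group = ServiceGroup("api", ["i-1", "i-2", "i-3"])
group.kill_random(rng)                   # disrupt one instance
assert group.is_available()              # the group survives the loss
```

The real systems add scheduling, scoping (instance, zone, region), and monitoring, but the principle is the same: if losing any one instance breaks you, you want to find out on a Tuesday morning, not during peak traffic.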
From March to September 2013, Netflix is also sponsoring a Cloud Prize competition for open source contributions to NetflixOSS. Code and information can be found at http://netflix.github.com. Submissions by third parties include new features and services, as well as ports to other environments such as the Eucalyptus private cloud.
The LASER lectures will cover 1) The motivations and economics of public cloud. 2) The migration path from datacenter to cloud. 3) Service and API Architectures. 4) Storage architecture. 5) Cloud based operations and tools. 6) Example applications.
Speaker: Pere Mato Vila, CERN
In this series of lectures we will cover the full life cycle of scientific software and the challenges of adapting it to do big science on clouds and grids. To illustrate with concrete needs we will use the LHC experiments at CERN, which have recently processed more than 15 PB of data, leading to extraordinary discoveries in the field of High Energy Physics. In general, big science requires big data, with all the challenges associated with data access and management, but it also requires high-performance scientific data processing software that allows scientists to extract knowledge from the unprecedented amounts of data coming from these modern experimental devices. A large and geographically distributed team of scientists designs and develops this software, which for the LHC experiments consists of several million lines of code. Moreover, most of these scientists are not formally trained as software engineers. Being able to produce working, performant software is perhaps the first challenge we need to cope with. We then need to integrate all these software components and libraries into a number of data processing applications that must analyze large amounts of data reliably and efficiently. We need to configure, optimize, and validate all these applications for various operating systems and platforms, and finally face the challenge of distributing and deploying them on clouds and grids. Perhaps the most challenging aspect is coping with software change. Scientific software is not very static: new ideas from scientists and better understanding of the experimental apparatus typically translate into new code that needs to be tested, configured, packaged, and deployed on the cloud. To really exploit the scientific potential of the experimental facilities and to encourage the creativity of scientists, it is far better to be able to upgrade and deploy new software in hours rather than in weeks.
Current technologies such as virtualization and clouds can really help big science.
Speaker: Bertrand Meyer, ETH Zurich and Eiffel Software
In the first set of topics covered by these lectures, I will review recent developments in the SCOOP concurrency model intended to support “scaling up”: providing the solid mechanisms required by high-performance computing and big data. The mechanisms include processing of large data structures concurrently and support for object mobility.
The second set of topics covers new developments in methods and tools for distributed software construction, an ever more important model for software projects. It will take advantage of lessons learned both in an industrial setting and in the ETH "Distributed Software Engineering Laboratory" course and project.
Speaker: Anthony Joseph, Berkeley
Mesos is a platform for running multiple diverse cluster computing frameworks, such as Hadoop, MPI, and web services, on commodity clusters. Sharing improves cluster utilization and avoids per-framework data replication. Mesos shares resources in a fine-grained manner, which allows frameworks to achieve data locality by taking turns reading data stored on each machine. To support the sophisticated schedulers of today's frameworks, Mesos introduces a distributed two-level scheduling mechanism called resource offers. Mesos decides how many resources to offer each framework, while frameworks decide which resources to accept and which computations to schedule on these resources. Our experimental results show that Mesos can achieve near-optimal locality when sharing the cluster among diverse frameworks, can scale up to 50,000 (emulated) nodes, and is resilient to node failures. Mesos is in production at numerous companies, including Airbnb, where it manages the open source Chronos platform, and Twitter, where it manages several thousand machines.
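The two-level scheduling mechanism described above can be sketched as follows. The classes are a simplification for illustration, not the real Mesos API: the master decides which resources to offer (level one), and the framework decides which offers to accept, here preferring nodes that hold its input data (level two).

```python
# A simplified sketch of two-level scheduling with resource offers
# (hypothetical classes, not the Mesos API). The master offers resources;
# the framework accepts only the offers it wants, e.g. for data locality.
class Master:
    def __init__(self, offers):
        self.offers = offers              # e.g. {"node1": {"cpus": 4}}

    def offer_to(self, framework):
        accepted = framework.resource_offer(dict(self.offers))
        for node in accepted:
            del self.offers[node]         # level one: master allocates
        return accepted

class HadoopFramework:
    def __init__(self, data_nodes):
        self.data_nodes = data_nodes      # nodes holding this job's input

    def resource_offer(self, offers):
        # level two: accept only offers on nodes holding our data
        return {n: r for n, r in offers.items() if n in self.data_nodes}

master = Master({"node1": {"cpus": 4}, "node2": {"cpus": 4}})
fw = HadoopFramework(data_nodes={"node2"})
accepted = master.offer_to(fw)
assert set(accepted) == {"node2"}        # locality-aware acceptance
assert "node2" not in master.offers      # resources now allocated
```

The design choice is that the master never needs to understand each framework's scheduling policy: it only tracks allocations, while each framework applies its own preferences to the offers it receives.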
Speaker: Roger Barga, Microsoft
Cloud computing allows data centers to provide self-service, automatic, and on-demand access to services such as data storage and hosted applications that provide scalable web services and large-scale data analysis. While the architecture of a data center is similar to that of a conventional supercomputer, the two are designed with very different goals. For example, cloud computing makes heavy use of virtualization technology for dynamic application scaling and of data storage redundancy for fault tolerance. And cloud computing often separates compute services from storage services to better support isolation and multitenancy. The massive scale and external network bandwidth of today's data centers make it possible for users and application service providers to scale from one to thousands of CPU cores and pay only for the resources consumed. However, efficiently utilizing cloud computing resources requires that developers understand the implications of data center design and cloud system architectures, along with new patterns and practices for programming in the cloud.
This LASER lecture series will cover: 1) data center design and its implications for application development, 2) cloud computing system architecture, using Windows Azure as a concrete example, 3) application programming models for cloud computing, and 4) common patterns and practices for Big Data analytics on the cloud, using examples from enterprise applications.