
OpenStack and Docker

 
security forum advocate
Would I still need OpenStack if I were developing on Docker with an orchestration framework like Kubernetes, Swarm, etc.?
 
Author
There is certainly overlap between what OpenStack does and what other tools do (what I would call regional or data-center-type orchestration tools), but many aspects of each framework are distinct. Architecturally, only you can determine whether you can live with one, the other, or both.

Fundamentally, OpenStack provides lower-level control of infrastructure (VM, container, and bare-metal provisioning, vendor-specific integration) than any container-only framework. I would say that OpenStack is a "cloud operating system", while container-based frameworks are more "application delivery systems". Depending on your requirements, you might not need OpenStack for this; in fact, you might not need Kubernetes or Swarm either, since you can acquire resources from Amazon.

Professionally I have deployed OpenStack instances to control underlying infrastructure (network, storage, etc.) and run containers within OpenStack.


Some additional thoughts:

Distributed system scheduling and orchestration is an area of my research and as a result I have spent a great deal of time thinking about such things.

I generally think of orchestration systems in terms of the scope of their control and break them down into three groups:

Cluster:
-High performance cluster (HPC)
-Hadoop, Spark, etc.

Data center / Region:
-OpenStack clouds
-Amazon EC2
-Microsoft Quincy[9] and Apollo [10]
-Google Borg[11] and Omega[12]
-Kubernetes [1]

Global:
-Typically application specific

Google Borg [11]: a large-scale cluster management software, which until recently* was considered “Google’s Secret Weapon” [13].
-Two-phase scheduling: first find the nodes that are suitable for the task, then score them and schedule onto the best one.
-High (service) and low (batch) priority scheduling, with independent resource quotas.
-Typical scheduling time is 25s. However, global (cluster) optimality is not attempted when making scheduling decisions.
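
The two-phase idea in the list above can be illustrated with a minimal sketch. This is not Borg's actual code or scoring policy: the Node and Task types, the feasible/score/schedule names, and the best-fit scoring rule are all hypothetical, chosen only to show "filter first, then score".

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    name: str
    free_cpu: float  # free CPU cores (hypothetical unit)
    free_mem: float  # free memory in GB (hypothetical unit)


@dataclass
class Task:
    name: str
    cpu: float
    mem: float


def feasible(node: Node, task: Task) -> bool:
    """Phase 1: can this node hold the task at all?"""
    return node.free_cpu >= task.cpu and node.free_mem >= task.mem


def score(node: Node, task: Task) -> float:
    """Phase 2: rank a feasible node. Assumed rule: best fit, i.e. the
    node with the least leftover capacity scores highest."""
    leftover = (node.free_cpu - task.cpu) + (node.free_mem - task.mem)
    return -leftover


def schedule(task: Task, nodes: List[Node]) -> Optional[str]:
    """Place the task on the best-scoring feasible node, or return None."""
    candidates = [n for n in nodes if feasible(n, task)]
    if not candidates:
        return None
    best = max(candidates, key=lambda n: score(n, task))
    # Reserve the resources on the chosen node.
    best.free_cpu -= task.cpu
    best.free_mem -= task.mem
    return best.name


nodes = [Node("a", 4, 8), Node("b", 2, 4), Node("c", 1, 1)]
first = schedule(Task("t1", 2, 3), nodes)
second = schedule(Task("t2", 2, 3), nodes)
third = schedule(Task("huge", 100, 100), nodes)
print(first, second, third)
```

Each decision here is local to one task, which matches the point above that global (cluster) optimality is not attempted.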

Apache Mesos [14]: an open-source cluster manager providing resource isolation and sharing across distributed resources.
-Mesos began as a research project [15] in the UC Berkeley RAD Lab by then PhD student Benjamin Hindman*
-Mesos has been adopted [16] by Twitter, eBay, Airbnb, Apple and at least 50 other organizations.
-“Mesos is a distributed systems kernel that stitches together a lot of different machines into a logical computer. It was born for a world where you own a lot of physical resources to create a big static computing cluster.”[17]

Kubernetes by Google [18]: an open-source platform for automating deployment, scaling, and operations of application containers across clusters of hosts.
-Kubernetes is based [19] on Google's Borg and "The Datacenter as a Computer" [20] papers.
-Kubernetes partners include Microsoft, Red Hat, VMware, IBM, HP, Docker, CoreOS, Mesosphere, and OpenStack*.
-“Kubernetes is an open source project that brings 'Google style' cluster management capabilities to data centers.” [17]
-“Kubernetes goal is to become as the standard way interact with computing clusters. Their idea is to reproduce the patterns that are needed to build cluster applications based on experiences at Google.”[17]

Note: From a scheduling and orchestration level, these are not global (multi-zone) schedulers!

-Most data center scheduling is based on bin packing optimization of CPU, memory, and network bandwidth resources, where resources are assumed to be uniform (by value).
-“Kubernetes cluster is not intended to span multiple availability zones. Instead, we recommend building a higher-level layer to replicate complete deployments of highly available applications across multiple zones”
-Application-centric schedulers like Fenzo*[22] (for Mesos) are designed to manage the ephemerality that is unique to the cloud, for workloads such as reactive stream processing systems for real-time operational insights and managed deployments of container-based applications.
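
To make the bin-packing point above concrete, here is a small sketch of first-fit decreasing over uniform nodes with CPU, memory, and network bandwidth as the packed dimensions. It is one textbook heuristic, not the algorithm of any scheduler named in this post; the function name, the task names, and the capacity numbers are made up for illustration.

```python
from typing import Dict, List, Tuple

Demand = Tuple[float, float, float]  # (cpu, mem, net), hypothetical units


def first_fit_decreasing(tasks: Dict[str, Demand],
                         capacity: Demand) -> List[List[str]]:
    """Pack tasks onto identical nodes; return the task names per node."""

    def size(demand: Demand) -> float:
        # Largest fraction of a node that the task needs in any dimension.
        return max(d / c for d, c in zip(demand, capacity))

    # Place the biggest tasks first; sorted() is stable, so ties keep
    # their original order.
    ordered = sorted(tasks.items(), key=lambda kv: size(kv[1]), reverse=True)

    nodes: List[dict] = []
    for name, demand in ordered:
        if size(demand) > 1:
            raise ValueError(f"task {name!r} does not fit on any node")
        for node in nodes:
            # First fit: take the first open node with room in every dimension.
            if all(f >= d for f, d in zip(node["free"], demand)):
                break
        else:
            # No open node fits, so bring up a new (uniform) node.
            node = {"free": list(capacity), "tasks": []}
            nodes.append(node)
        node["free"] = [f - d for f, d in zip(node["free"], demand)]
        node["tasks"].append(name)
    return [n["tasks"] for n in nodes]


capacity = (4, 8, 10)
tasks = {
    "web": (2, 4, 5),
    "db": (3, 6, 2),
    "cache": (1, 2, 3),
    "batch": (2, 2, 5),
}
placement = first_fit_decreasing(tasks, capacity)
print(placement)
```

The uniform-capacity argument mirrors the assumption mentioned above that resources are treated as uniform by value; a scheduler spanning several zones would also have to account for cost and latency differences between them, which this sketch ignores.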


[1] http://kubernetes.io/v1.0/docs/design/README.html
[2] http://storm.apache.org
[3] http://samza.apache.org
[4] https://aws.amazon.com/kinesis/
[5] https://hadoop.apache.org
[6] http://cassandra.apache.org
[7] http://orientdb.com/orientdb
[8] https://github.com/ResearchWorx/Cresco/wiki
[9] M. Isard, V. Prabhakaran, J. Currey, U. Wieder, K. Talwar, and A. Goldberg. Quincy: fair scheduling for distributed computing clusters. In Proc. ACM Symp. on Operating Systems Principles (SOSP), 2009.
[10] E. Boutin, J. Ekanayake, W. Lin, B. Shi, J. Zhou, Z. Qian, M. Wu, and L. Zhou. Apollo: scalable and coordinated scheduling for cloud-scale computing. In Proc. USENIX Symp. on Operating Systems Design and Implementation (OSDI), Oct. 2014.
[11] Verma, Abhishek, et al. "Large-scale cluster management at Google with Borg." Proceedings of the Tenth European Conference on Computer Systems. ACM, 2015.
[12] Schwarzkopf, Malte, et al. "Omega: flexible, scalable schedulers for large compute clusters." Proceedings of the 8th ACM European Conference on Computer Systems. ACM, 2013.
[13] http://www.wired.com/2013/03/google-borg-twitter-mesos/all/
[14] http://mesos.apache.org/
[15] Hindman, Benjamin, et al. "A common substrate for cluster computing." Workshop on Hot Topics in Cloud Computing (HotCloud). Vol. 2009. 2009.
[16] http://mesos.apache.org/documentation/latest/powered-by-mesos/
[17] http://stackoverflow.com/questions/26705201/whats-the-difference-between-apaches-mesos-and-googles-kubernetes
[18] http://kubernetes.io/
[19] http://www.infoq.com/news/2015/08/past-present-future-kubernetes
[20] Barroso, Luiz André, Jimmy Clidaras, and Urs Hölzle. "The datacenter as a computer: An introduction to the design of warehouse-scale machines." Synthesis lectures on computer architecture 8.3 (2013): 1-154.
[21] https://github.com/kubernetes/kubernetes/blob/master/plugin/pkg/scheduler/algorithm/priorities/priorities.go
[22] https://github.com/Netflix/Fenzo
[23] http://techblog.netflix.com/2015/08/fenzo-oss-scheduler-for-apache-mesos.html
 
Sai Hegde
security forum advocate
Great answer... Really appreciate the details on it.

Also, does embracing Docker as an application delivery system bring any value to OpenStack?
 
Saloon Keeper
My server farm is for R&D purposes, so I have a small number of physical machines, have a certain amount of legacy infrastructure (which makes me no different than most IT shops), and I have to be prepared to mimic a diverse set of infrastructures. Some of my most powerful equipment, in fact, is only powered up if there's a paying customer who needs it - the noise and power requirements don't justify keeping that stuff online unless it's actually necessary.

I have 2 Docker hosts at the moment, one running CentOS 6 (my major workload) and one running CentOS 7 (because some Docker image builds done under CentOS 6 can crash Docker). These are older machines, so the Docker containers are in discrete VMs, not in OpenStack instances. To run OpenStack, a machine should ideally have at least 16GB of RAM, and I'm not sure that one of my Docker hosts can physically go that high.

So my production Docker instances run without the benefit of OpenStack.

On the other hand, a great deal of effort has been exerted in the area of Container-in-Cloud support: Amazon's Elastic Beanstalk and EC2 features, for example. There are similar works in progress for OpenStack. And, if memory serves, Vagrant has a plugin especially designed to construct and spin up Docker containers in OpenStack.

Cloud instances provide the flexibility of being able to spin up and transport entire VM images and do so with a minimum of redundant resources. Docker instances share these virtues as well. So it's no surprise that people have been putting the 2 together.