Category: DevOps & Design

Dynamic-Periphery.com by McDevOps – You can take it with you.

 - by Asher Bond

Just got back from International CES. Nice to see some familiar faces and meet many new people! McDevOps makes computers for DevOps. The newest computer we’re working on is called Dynamic-Periphery™. Unsatisfied with the one-to-one constraint of personal computing, we decided that a workstation isn’t a personal computer. One workstation, powered by supercomputers, could be accessed by many tablets. But we couldn’t just use any tablets; we needed dynamic periphery. This means that one user may use several tablets in order to have a more tailored user experience, and be able to send their user experience to another user. Portable user experience is one of the most exciting features of cloud-based virtual desktop infrastructure. So we took it a step further and began designing specialized tablets for use with desktop supercomputing workstations. It just makes more sense in today’s software engineering, video production, and enthusiast gaming environments. After all, if DevOps culture doesn’t constrain what-could-be by what-is, then why should hardware constrain platform service software? We think it’s also better to have a consistent user experience in development and production, and we think a common yet flexible software framework (for example, prototype-friendly structured programming in Dart or open PaaS frameworks like Cloud Foundry and OpenShift) facilitates this efficiency in many software engineering practices. We’re also excited about companies like Canonical who have committed to providing top-notch long-term support for service-orchestration frameworks.

Dog-fooding the Supercomputer

But honestly, localized computation is only half of the fun of cloud VDI. We really wanted to rock the portable UX over the internet globally. And that’s doable with a McDevOps microcloud account (contact me if you want an invite), whether or not you roll your own microcloud. Microcloud accounts will be free for engineers, developers, designers, and devops culturists… and in general free for anyone looking for work or something to hack on. But it’s not just a SaaS model; it’s a PaaS model from a software perspective. From the hardware perspective it’s a gateway appliance taking you through the pearly gates to supercomputing heaven in the cloud. Desktops are a heavy workload in and of themselves, especially in the aggregate. The problem with all the cloud hype in consumer electronics or “personal cloud” is that it has gotten away from cloud computing’s future value. The future value of cloud computing is that it offers scalability. As Dave Nielsen says, cloud computing is OSSM (On Demand, Scalable, Self-serviceable, and Measurable), and I say it’s OSSAM (adding Automation, which is implied in every letter of OSSM)… Consumer electronics manufacturers haven’t really delivered the scalability components, but rather what seems to be an overprovisioned appliance or box. The cloud is not a box, nor a puppet show, but maybe more like a vending machine. Get served.

We might be engineers or developers, but we’re often not a this-or-a-that; we’re often both. And I think in DevOps culture this is the case. I think it’s also the case that a desktop hybrid microcloud can handle heavier video production workloads much better than a beefed-up Mac (request demo), due to parallel elastic provisioning at hyperscale supporting rendering workloads, for example. And that’s just one example, because rendering is just one video production workload. And when these guys get bored they hold LAN parties, which work really nicely with a desktop microcloud in your cube farm or wherever.

So think how software engineers play with supercomputers while video producers play 3-D shooters. It’s a competition, but for practical purposes the same infrastructure is used to prove the concept that collaboration is like competition on steroids… especially when you can use the same tools and share the same big data insights.

So at CES this year it really seemed as though cloud either meant wireless or SAN or NAS… but I think cloud storage is a nice low hanging fruit. Cloud persistence is the other benefit of microcloud. It’s a gateway to public utility persistence of files. So it takes the load off your tablets and keeps things locally accessible via ultra high speed bandwidth while it slowly persists remotely in heaven… eventually consistent and redundantly persistent… You can take it with you.
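Here’s a minimal sketch of that “fast locally, slowly persisted remotely” idea as a write-behind store. The class name `WriteBehindStore` and its methods are hypothetical, invented for illustration, and the remote side is assumed to be any dict-like object; it is not how a McDevOps microcloud is actually implemented.

```python
from collections import deque


class WriteBehindStore:
    """Serve reads and writes locally; push writes to a remote store later."""

    def __init__(self, remote):
        self.local = {}          # fast local copy (stands in for the gateway)
        self.remote = remote     # slow remote store (assumed dict-like)
        self.pending = deque()   # keys waiting to be persisted remotely

    def put(self, key, value):
        # The write completes as soon as the local copy is updated.
        self.local[key] = value
        self.pending.append(key)

    def get(self, key):
        # Prefer the local copy; fall back to the remote store.
        if key in self.local:
            return self.local[key]
        return self.remote[key]

    def flush(self, max_items=None):
        """Persist up to max_items queued writes; return how many were sent."""
        sent = 0
        while self.pending and (max_items is None or sent < max_items):
            key = self.pending.popleft()
            # Send the latest local value, so repeated writes converge.
            self.remote[key] = self.local[key]
            sent += 1
        return sent
```

Until `flush` has drained the queue, the remote copy lags the local one, which is the “eventually consistent” part.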

The CLOUD is real… now what?

 - by Asher Bond

The CLOUD is upon you

In 2011 many were still wondering if the CLOUD really meant anything in terms of technology, dollars, and/or cents. Looking back on 2011 all I can see is a whirl of nebulocity surrounding what-is with what-could-be. Here’s what I think might change significantly in the next 12 months or so:

The CLOUD is real… WHERE’S MINE?

Ok, so we’ve seen people make money off cloud… now I want one. Go build me my own thing that makes money too. Make it look like the King and maybe the King will be forced to buy it… I mean there can’t be 3 kings can there? So now that 2012 is almost here… people are realizing the cloud isn’t just a nebulous swirl of vapor-ware… now let’s start the ASP second chance foundation. Do I need a license for that? I think there will be a lot of opportunities to abstract licenses with SaaS deliveries. Some may exploit the gimmicks that should not have been codified into the licenses in the first place. What comes around goes around, but by now the only ISVs who are likely to be affected by it are the monolithically most comprehensive solution providers who claim they invented everything. Invention by consolidation should be on the rise in 2012, by the way, I’m guessing.

ASP Second Life


Application service providers were right. Applications can often be served better warm, with human love. At minimum viability, a product contains at least one service component. Automation is great, but services contain humans and humans contain human error. Consumers love to cut out the middle-man, but once they’ve made all their man-in-the-middle attacks and all their paper dolls of sliced and diced middle-men they realize that they want service. So they go to http://asherbond.com/contact and ask for technical advice. Anyone who knows Second Life (or other virtual realities) knows that people like to design things and build things themselves. But if you’re going to build a cloud please ask yourself where the economies-of-scale exist. Now that the technology concepts have been proven in business practice many more customers are going to ask for cloud service, but what they’re really actually asking for is people (sometimes via a RESTful API).

The difference between application services and software-as-a-service is abstraction measured by a degree of multi-tenancy.

Compliance-and-Regulatory-Tunneling-and-Channeling-as-a-Service

They thought regulations and compliance “hurdles” created jobs… and they were right… in the short term… but what they might have missed is that it also creates jobs for service providers who can broker emerging technology as a service.

Business-Process-as-a-Service (#BPaaS)

What kinda cloud u talking bout? We got SaaS BPaaS and my personal favorite: GSaaS. GSaaS loves you brother. Now let me show you how to run your business. I expect to hear a lot of “what kinda PaaS” from developers and a lot of ooooo aaaah from business process practitioners… but the process consultants deserve a chance to really shine and this is it. I got my developer card revoked a couple times for saying “Cloud is SOA” but I got a new one from VeriSign and now I think developers are starting to be cool about it now that they realize that OASIS was right and that so was I since I said so too, neh. The first guy who raked my graphic depictions over the campfire did admit however, “yeah ok man.. i guess if you’re talking about REST.” So it turns out predictions in 2010 were accurate. I think service-component architecture and visual programming are going to play a role in RESTful integration as software components are service-oriented. I strongly expect scalability requirements and cloud-readiness motivators to stir the pot. Service-orientation is inevitable when technology is applied. Developers are empowered as decision makers and technical advisors, so maybe they would be interested in subscribing to business-process-as-a-service since they have more of a technical focus.

The most COMPREHENSIVE solution – brought to you by the Federated Association of Governing Consolidators

So what if you’re an investor and you buy and sell technology securities and you want some of that good old fashioned ROI. How can you make any money in this cloud biz now that the developers are taking over? Oh yeah there’s this little thing called the most COMPREHENSIVE solution. Big comprehensive, little solution. That’s right folks. The time is NOW. Buy everything. Your cloud portfolio is about to make it rain, but before you buy everything… you have to know how this stuff works and what it does. Haha just joking… now back to our regularly consolidated program… I think in 2012 we might continue to see enterprisey comprehensive solution providers trying to convince people that they are the box you can put your cloud into… or are they more of a comprehensive solution “cloud” that spans actual clouds with meaningful definitions which exist in actual physical datacenters? Who gives these large enterprisey comprehensive solution providers the authority to do this? The customer lets them get away with it because they sponsor industry events and they are often older companies who played a role in many of the technologies that end up as cloud. They equivocate between distribution models of cloud computing, for example… they might get behind the technology curve doing tons of non-emerging has-been-mature-for-a-decade-or-so SaaS business then pretend they are powering IaaS today on a public scale… when the emerging technologies are PaaS based.

DevOps as more of a cultural paradigm shift and movement and less of a title

People are going to start either killing each other based on their choice of configuration management / automation framework or they are going to start getting along more and not putting DevOps in their title unless it has Engineer at the end of it and Lead in the front of it. Designers are going to be constrained by tighter iterations and Ops are going to punch developers just because they haven’t been punched before and everyone goes through it.

Developer-as-a-customer

In the old days, developers could be divided and conquered by business managers much more easily. The days of developers having a great idea that no one understands are not over… but “I don’t understand how this stuff works” is no longer an excuse now that we have so many services available. If you don’t know how something works… just ask… only now… you don’t even have to ask how to do it, you can ask for service. If you don’t know how something works, that something might be new and valuable. Dustin said it already, but I think public offerers are going to focus more on influencing the decisions of software developers. Software developers represent change in the direction of requirements and demands… not just whatever seems wanted right now… I think developers often try to guess (like Steve Jobs R.I.P.) what people need since they’re probably going to want that eventually. I could probably guess that a pregnant mom is going to be in the market for diapers sooner or later. Hopefully sooner rather than later. Developers are in the early stages from cradle to grave. They iterate through software development and application life cycles and deliver features based on requirements. Those features become part of a common framework that can be offered more publicly. It’s not new, but software vendors love to put developers on their platforms. What’s new is that developers are not-so-divided and not-so-conquered… so they probably demand a higher degree of ubiquity in their distribution channels… so they probably demand a higher degree of interoperability in their language frameworks.

Applications are most portable when the target distribution platform is based on open-standards.

Public Platform-as-a-Service (PaaS) Top Doggery

Not everyone can be King of the Hill, but I think there’s room for a whole circle of winners in the market segment of public PaaS. We have seen 3 generations of public platform service offerings to developers:

Totally Rigidly Arcane PaaS

The first platform services with public offerings forced the developer to conform to a proprietary framework. The back end was a confidential operation delivered as a multi-tenant service to subscribers who learned how to conform to the proprietary framework. The framework may have been based on python or java, but constrained the developer to the platform of implementation rather than the standards of the enabling technologies within.

Still-exploiting-the-constraint PaaS

This type of platform is built secretly and operates as a proprietary service, but relies on open-source components to deliver services which are mostly compliant with open-standards. A true language is always an open-standard.

Open PaaS – as it should be

Third generation platform services are completely portable. This type of middle-ware essentially replaces the role of the “operating system” as a software component with “systems-in-operation” instantiated as objects by a framework of classes delivered as a platform of services for developers to build things on top of. The distribution model allows for services to be delivered with scalability, flexibility, interoperability, high availability and the distribution model also allows for platform portability and application interoperability by default. The evolution of service-component architecture (SCA) and visual programming may also influence the adoption of visual programming in the cloud as practical users are abstracted by service and frictionless design becomes the practice.

Next Generation PaaS+

I think of PaaS+ as a value-added platform-as-a-service which may include business processes as a service or may include additional DevOps tooling or methodologies-as-a-service (MaaS?) whatever… The framework (tool) teaches you the process. In a toolcloud you might experience something like a toolbox… for example when you’re using Gmail, you realize that Gmail is a Google approach to email… it’s not just an “email program” … so you get some agility along with the nebulocity of the cloudy SaaSfulness. So I think that the next generation PaaS+ will need to put their pluses on by adding some kind of business or other practical high level value. Some of this high level value can be delivered in the form of integration. CloudBees has moved forward with their initiative to add continuous integration via Jenkins/Hudson integrated service components in their PaaS offering. I think DevOps toolclouds will emerge via the PaaS delivery model and that, like CloudBees, other cloud service providers who have a PaaS offering may choose to offer a chocolate or strawberry new flavor of PaaS for Dev and possibly a vanilla PaaS for their long term support in production interoperability and highly available portability PaaSes. I guess Leeloo Dallas could call that one a multi-PaaS just in time to kiss Korben and save the world before New Year’s.

Predictive Monitoring and SLAs

Predictive monitoring tools will leverage Hadoop and other big data / analytics. The abstraction of data itself may become an abstract business-process-as-a-service and drive innovation in system performance as SLA’s are enforced and predictive deep monitoring tools allow autonomous and dynamic autoscaling of instances in resource pools.
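As a toy illustration of predictive autoscaling, here’s a sketch that fits a straight line to recent load samples and sizes the pool for the forecast rather than the current load. The function names, the linear-trend model, and the idea of a fixed capacity per instance are all my assumptions for the example; a real predictive monitoring tool would use far richer analytics.

```python
import math


def forecast_load(samples, steps_ahead):
    """Extrapolate a least-squares straight line through equally spaced samples."""
    n = len(samples)
    if n < 2:
        # Not enough history for a trend; assume the load stays where it is.
        return float(samples[-1])
    mean_x = (n - 1) / 2
    mean_y = sum(samples) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(samples))
    var = sum((x - mean_x) ** 2 for x in range(n))
    slope = cov / var
    intercept = mean_y - slope * mean_x
    return intercept + slope * (n - 1 + steps_ahead)


def instances_needed(samples, steps_ahead, capacity_per_instance, min_instances=1):
    """Size the pool for the forecast load, never dropping below min_instances."""
    predicted = max(forecast_load(samples, steps_ahead), 0.0)
    return max(min_instances, math.ceil(predicted / capacity_per_instance))
```

With load samples of 100, 200, 300, and 400 requests per interval, the trend two intervals out is 600, so at an assumed 250 requests per instance the pool would be sized at 3 before the demand actually arrives.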

Resource Pool Expansion and Utility Computing Commoditization

I think the price of public cloud will start to look like a true utility and come down quite a bit. Companies like Amazon Web Services probably would lower their prices if the demand weren’t way too high. When more IaaS vendors such as Rackspace, Opsource, Datapipe, et al. enter the space (they’re already here) and start to compete for customers, the price of raw x86 compatible IaaS should come down quite a bit and make people re-think their hybrid strategies. For now, many organizations may benefit from a flexible hybrid cloud strategy that (for example) may leverage their existing infrastructure to orchestrate public cloud services.

Security implications of Cloud Computing

Cloud computing lowers the barriers to entry by people who ordinarily could not access high performance clusters of nodes to do complex brute-force math research on your “encrypted” password… or just fire up an array of nodes and aim it at the ssh port. Nothing they couldn’t do in the old days of dark matter / botnet clouds. What IP address did that come from? A leased one in a classy datacenter. I think public cloud providers are going to become very security-savvy (actually they really are top notch in most cases). It will be interesting to see how they empower themselves from the big data + hypervisor perspective.
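To put rough numbers on the brute-force point, here’s a back-of-the-envelope sketch. The function name and every figure you feed it (alphabet size, password length, guesses per second per node, node count) are assumptions for illustration, not measurements of any real cluster or hash.

```python
def brute_force_hours(alphabet_size, length, guesses_per_second_per_node, nodes):
    """Worst-case hours to try every password of the given length."""
    keyspace = alphabet_size ** length
    seconds = keyspace / (guesses_per_second_per_node * nodes)
    return seconds / 3600
```

The time falls linearly with the number of rented nodes, while each extra character multiplies the keyspace by the alphabet size, which is why longer passwords help more than hoping the attacker has a small cluster.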

Rinse that CLOUD out ‘cha mouth boy!

At some point… analysts are saying that there is a “hype cycle” in which cloud word sentiment shall become stale. The word cloud will either become ultra-ubiquitous like industry insiders are saying… or it may become a bit blasé… numb from the excessive nebulocity of smoke and mirrors becoming clouds too. I think if we can refrain from partying too hard it might help. Happy New Year’s Eve. Be responsible and make backups.

DevOps Day: When Success is 99% Failover – How Availability Can Persist in the AWS Cloud When Network Events Also Persist in an EC2 / RDS Region

 - by Asher Bond

Some might refer to today as a DevOps Day… and to those who haven’t figured out their failover strategy, today might seem like the day the cloud stood still. But if you’re familiar with Internet service at large, you’ve seen it before. Network events persist, whether it be in the datacenter or in the Cloud… a sad hardship we face on shared networks such as the Internet. Remember that infrastructure services such as EC2 and DBMS services such as RDS are merely service layers on top of a data-center. Are you afraid of Cloud or data-center? Fear not, but perhaps the biggest “cloud” is the dark one powered by those who allow their computers to be compromised. If a denial-of-service attack is distributed, a provider-of-service defender should work just as hard to distribute his or her eggs… well… I guess her eggs in multiple baskets. Failover is a difficult concept for many applications, out of the box, because it requires a great deal of redundancy and synchronization. The database is perhaps the most difficult piece of the puzzle to distribute… especially if it is a relational database. Master -> Slave replication is one way to achieve not only multi-tiered horizontal scalability on demand, but also multi-regional redundancy. Take a look at the reference architecture just announced as part of a Rightscale + Zend horizontal scalability solution:

The separation of static content from dynamic content is a concept that will lead to higher efficiency and higher availability in any Cloud environment. Backups from master to slave databases may seem expensive across availability zones, but perhaps, after today, they are less expensive than we once thought.

Now let’s think about Content Distribution Networks. Static content can be cached at the edge which provides the most availability to your end users. When people think of CDN availability, they might assume “closest geographical region to the end user”… but what if your CDN was smart enough to weigh latency and system load as metrics in the load balancing determination algorithms? Do we have that? Yeah. Skeptics blame AWS / EC2 for today’s hardships, but perhaps some should be thanking them for edge-caching static content worldwide. It’s a saving grace for those who have their eggs scattered amongst 18 geographic regions.
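Here’s a small sketch of what weighing latency and system load could look like when choosing an edge. The function name `pick_edge`, the weights, and the scoring formula are hypothetical; this is not how any particular CDN actually does it.

```python
def pick_edge(edges, latency_weight=0.7, load_weight=0.3):
    """Return the name of the edge with the lowest weighted score.

    edges maps an edge name to (latency_ms, load), where load is 0.0 to 1.0.
    """
    worst_latency = max(latency for latency, _ in edges.values()) or 1.0

    def score(name):
        latency, load = edges[name]
        # Normalise latency to 0..1 so it is comparable with load.
        return latency_weight * (latency / worst_latency) + load_weight * load

    return min(edges, key=score)
```

With these made-up weights a nearby edge wins even when it is busy, but shift the weights toward load and a quieter, farther edge takes over. So the lowest-latency edge isn’t automatically the right answer.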

For static content, content distribution networks often have multi-region high availability built in out-of-the-box. It’s a lot easier when dealing with static content, but with some systems architecture and database management expertise, the same caching principles can also be applied to maximize reliable delivery of both static and dynamic content.

If an application provider or platform service provisioner can separate static content from persistent data and also separate important data from not-so-important temporary / session data and deliver these types of data and content with discardable instances… fail-over can be achieved and even automated by replicating data across providers (or at least across cloud regions / availability zones). Once static content, persistent data, and temporal data have been sorted out… a redundant, meshed / multi-homed front-end server-array tier can determine (based on monitoring and availability metrics) which cloud / data-center / availability zone to distribute static and dynamic content from.
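A rough sketch of that front-end decision, assuming the monitoring system hands us a status record per zone. The `ZoneStatus` fields, the thresholds, and the three content kinds are names I made up for the example.

```python
from dataclasses import dataclass


@dataclass
class ZoneStatus:
    name: str
    reachable: bool
    error_rate: float      # fraction of failed health checks, 0.0 to 1.0
    replica_lag_s: float   # seconds the zone's database replica is behind


def choose_zone(zones, kind, max_error_rate=0.05, max_lag_s=30.0):
    """Pick a zone to serve `kind`: "static", "session", or "persistent"."""
    candidates = [z for z in zones if z.reachable and z.error_rate <= max_error_rate]
    if kind == "persistent":
        # Only persistent data depends on how fresh the replica is.
        candidates = [z for z in candidates if z.replica_lag_s <= max_lag_s]
    if not candidates:
        return None
    # Prefer the zone with the fewest failed health checks.
    return min(candidates, key=lambda z: z.error_rate).name
```

Because static and session content never look at replica lag, they can fail over to any healthy zone right away, while persistent data waits for a zone whose replica has caught up.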

I think this type of architecture can be justified not only for fail-over reasons, but perhaps it can also be a way to achieve more rapidly elastic, impressive server performance.

When N. Virginia gets hit hard, it may be quite a hardship, but it shouldn’t be too hard to fail over to your other region’s slave database. If Soichiro Honda is going to tell me that success is 99% failure, then in the case of distributed, edge cached, redundant web systems architecture… perhaps success is 99% failover.

But don’t just go throwing shuriken at network-event coordinators unless your star has more than just these two points. I think a nice third point to sharpen and cut to is the reliability of monitoring systems. It’s good to be monitoring your auto-scaling processes if you’re in a situation where you scale on demand… and you also want to monitor who is demanding the computing resources. Ideally, you’re getting alarmed before your end users are. Reflexive firewalls are a good way to go, but just having good reflexes is part of wearing the agile cat’s hat in general. If you have a fast way to report trouble to the authorities charged with ownership of a compromised node attacking your system, you’re part of the solution and get a gold star.
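For the “monitor who is demanding the computing resources” part, here’s a minimal sliding-window sketch that flags a client once it exceeds a request budget. The class name `DemandMonitor` and the default limits are hypothetical, and what you do with a flagged client (alert, reflexive firewall rule, abuse report) is left out.

```python
from collections import defaultdict, deque


class DemandMonitor:
    """Count requests per client inside a sliding time window."""

    def __init__(self, window_s=60.0, max_requests=100):
        self.window_s = window_s
        self.max_requests = max_requests
        self.hits = defaultdict(deque)   # client -> timestamps inside the window

    def record(self, client, now):
        """Record one request; return True if the client is over the limit."""
        q = self.hits[client]
        q.append(now)
        # Drop timestamps that have fallen out of the window.
        while q and q[0] <= now - self.window_s:
            q.popleft()
        return len(q) > self.max_requests
```

Passing `now` in explicitly (for example from `time.monotonic()`) keeps the sketch easy to test and lets the alarm fire on the same clock your autoscaler uses.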

Conversely, unnecessary reflexive post-mortem backups en masse may have been a somewhat panicked response to the network event and a contribution to the length of this outage.

Amazon Web Services has done an excellent job (as always) of not only describing what happened and when service is expected to be restored, but what you can do to maximize availability if your service has been adversely affected by the outage. You can access their status updates via RSS feeds directly from the AWS Service Health Dashboard at status.aws.amazon.com.

Here’s a copy of what AWS is saying about EC2 services in the N. Virginia region [ RSS ]:

1:41 AM PDT We are currently investigating latency and error rates with EBS volumes and connectivity issues reaching EC2 instances in the US-EAST-1 region.
2:18 AM PDT We can confirm connectivity errors impacting EC2 instances and increased latencies impacting EBS volumes in multiple availability zones in the US-EAST-1 region. Increased error rates are affecting EBS CreateVolume API calls. We continue to work towards resolution.
2:49 AM PDT We are continuing to see connectivity errors impacting EC2 instances, increased latencies impacting EBS volumes in multiple availability zones in the US-EAST-1 region, and increased error rates affecting EBS CreateVolume API calls. We are also experiencing delayed launches for EBS backed EC2 instances in affected availability zones in the US-EAST-1 region. We continue to work towards resolution.
3:20 AM PDT Delayed EC2 instance launches and EBS API error rates are recovering. We’re continuing to work towards full resolution.
4:09 AM PDT EBS volume latency and API errors have recovered in one of the two impacted Availability Zones in US-EAST-1. We are continuing to work to resolve the issues in the second impacted Availability Zone. The errors, which started at 12:55AM PDT, began recovering at 2:55am PDT
5:02 AM PDT Latency has recovered for a portion of the impacted EBS volumes. We are continuing to work to resolve the remaining issues with EBS volume latency and error rates in a single Availability Zone.
6:09 AM PDT EBS API errors and volume latencies in the affected availability zone remain. We are continuing to work towards resolution.
6:59 AM PDT There has been a moderate increase in error rates for CreateVolume. This may impact the launch of new EBS-backed EC2 instances in multiple availability zones in the US-EAST-1 region. Launches of instance store AMIs are currently unaffected. We are continuing to work on resolving this issue.
7:40 AM PDT In addition to the EBS volume latencies, EBS-backed instances in the US-EAST-1 region are failing at a high rate. This is due to a high error rate for creating new volumes in this region.
8:54 AM PDT We’d like to provide additional color on what we’re working on right now (please note that we always know more and understand issues better after we fully recover and dive deep into the post mortem). A networking event early this morning triggered a large amount of re-mirroring of EBS volumes in US-EAST-1. This re-mirroring created a shortage of capacity in one of the US-EAST-1 Availability Zones, which impacted new EBS volume creation as well as the pace with which we could re-mirror and recover affected EBS volumes. Additionally, one of our internal control planes for EBS has become inundated such that it’s difficult to create new EBS volumes and EBS backed instances. We are working as quickly as possible to add capacity to that one Availability Zone to speed up the re-mirroring, and working to restore the control plane issue. We’re starting to see progress on these efforts, but are not there yet. We will continue to provide updates when we have them.
10:26 AM PDT We have made significant progress in stabilizing the affected EBS control plane service. EC2 API calls that do not involve EBS resources in the affected Availability Zone are now seeing significantly reduced failures and latency and are continuing to recover. We have also brought additional capacity online in the affected Availability Zone and stuck EBS volumes (those that were being remirrored) are beginning to recover. We cannot yet estimate when these volumes will be completely recovered, but we will provide an estimate as soon as we have sufficient data to estimate the recovery. We have all available resources working to restore full service functionality as soon as possible. We will continue to provide updates when we have them.
11:09 AM PDT A number of people have asked us for an ETA on when we’ll be fully recovered. We deeply understand why this is important and promise to share this information as soon as we have an estimate that we believe is close to accurate. Our high-level ballpark right now is that the ETA is a few hours. We can assure you that all-hands are on deck to recover as quickly as possible. We will update the community as we have more information.
12:30 PM PDT We have observed successful new launches of EBS backed instances for the past 15 minutes in all but one of the availability zones in the US-EAST-1 Region. The team is continuing to work to recover the unavailable EBS volumes as quickly as possible.
1:48 PM PDT A single Availability Zone in the US-EAST-1 Region continues to experience problems launching EBS backed instances or creating volumes. All other Availability Zones are operating normally. Customers with snapshots of their affected volumes can re-launch their volumes and instances in another zone. We recommend customers do not target a specific Availability Zone when launching instances. We have updated our service to avoid placing any instances in the impaired zone for untargeted requests.
6:18 PM PDT Earlier today we shared our high level ETA for a full recovery. At this point, all Availability Zones except one have been functioning normally for the past 5 hours. We have stabilized the remaining Availability Zone, but recovery is taking longer than we originally expected. We have been working hard to add the capacity that will enable us to safely re-mirror the stuck volumes. We expect to incrementally recover stuck volumes over the coming hours, but believe it will likely be several more hours until a significant number of volumes fully recover and customers are able to create new EBS-backed instances in the affected Availability Zone. We will be providing more information here as soon as we have it.

Here are a couple of things that customers can do in the short term to work around these problems. Customers having problems contacting EC2 instances or with instances stuck shutting down/stopping can launch a replacement instance without targeting a specific Availability Zone. If you have EBS volumes stuck detaching/attaching and have taken snapshots, you can create new volumes from snapshots in one of the other Availability Zones. Customers with instances and/or volumes that appear to be unavailable should not try to recover them by rebooting, stopping, or detaching, as these actions will not currently work on resources in the affected zone.

10:58 PM PDT Just a short note to let you know that the team continues to be all-hands on deck trying to add capacity to the affected Availability Zone to re-mirror stuck volumes. It’s taking us longer than we anticipated to add capacity to this fleet. When we have an updated ETA or meaningful new update, we will make sure to post it here. But, we can assure you that the team is working this hard and will do so as long as it takes to get this resolved.

Notice the ENTIRE CLOUD has certainly not collapsed. They are providing you a way to spin up instances in many availability zones that are available as usual. These are highly available availability zones which are not affected by this outage and may serve as failover with proper implementation of redundant server architecture.

Here’s a copy of what Amazon Web Services is saying about RDS services in the N. Virginia Region [ RSS ]:

1:48 AM PDT We are currently investigating connectivity and latency issues with RDS database instances in the US-EAST-1 region.
2:16 AM PDT We can confirm connectivity issues impacting RDS database instances across multiple availability zones in the US-EAST-1 region.
3:05 AM PDT We are continuing to see connectivity issues impacting some RDS database instances in multiple availability zones in the US-EAST-1 region. Some Multi AZ failovers are taking longer than expected. We continue to work towards resolution.
4:03 AM PDT We are making progress on failovers for Multi AZ instances and restore access to them. This event is also impacting RDS instance creation times in a single Availability Zone. We continue to work towards the resolution.
5:06 AM PDT IO latency issues have recovered in one of the two impacted Availability Zones in US-EAST-1. We continue to make progress on restoring access and resolving IO latency issues for remaining affected RDS database instances.
6:29 AM PDT We continue to work on restoring access to the affected Multi AZ instances and resolving the IO latency issues impacting RDS instances in the single availability zone.
8:12 AM PDT Despite the continued effort from the team to resolve the issue we have not made any meaningful progress for the affected database instances since the last update. Create and Restore requests for RDS database instances are not succeeding in US-EAST-1 region.
10:35 AM PDT We are making progress on restoring access and IO latencies for affected RDS instances. We recommend that you do not attempt to recover using Reboot or Restore database instance APIs or try to create a new user snapshot for your RDS instance – currently those requests are not being processed.
2:35 PM PDT We have restored access to the majority of RDS Multi AZ instances and continue to work on the remaining affected instances. A single Availability Zone in the US-EAST-1 region continues to experience problems for launching new RDS database instances. All other Availability Zones are operating normally. Customers with snapshots/backups of their instances in the affected Availability zone can restore them into another zone. We recommend that customers do not target a specific Availability Zone when creating or restoring new RDS database instances. We have updated our service to avoid placing any RDS instances in the impaired zone for untargeted requests.

11:42 PM PDT In line with the most recent Amazon EC2 update, we wanted to let you know that the team continues to be all-hands on deck working on the remaining database instances in the single affected Availability Zone. It’s taking us longer than we anticipated. When we have an updated ETA or meaningful new update, we will make sure to post it here. But, we can assure you that the team is working this hard and will do so as long as it takes to get this resolved.

These updates are merely a copy, not a direct feed from Amazon, so please subscribe to the Amazon Service Health Dashboard for the latest information about their service (which I still insist is high quality).

- Asher Bond
It’s a long way down if your head is in the CLOUD.

McDevOps and DevOps Design Strategy

 - by Asher Bond

What is DevOps?

DevOps is the strategic intersection of technical quality assurance, technical operations, and development. In most implementations or attempts to achieve DevOps synergy, Development refers to innovative software engineering, but may also refer to other innovation such as creativity. The DevOps movement got rolling because people with a dedication to productivity became frustrated with tossing packages over the wall of confusion between Development and production operations.

Rajiv Pant's DevOps Diagram

Developers have a stake in the innovation game any time they are hired to be creative, artistic, or to re-engineer that which could be improved upon. Those who are paid to do something new often find their interests in conflict with production operations engineers who are paid to operate within a reliable, proven framework. These production operators are members of a highly available infrastructure. DevOps solves this problem by crushing the wall of confusion with the hammer of integration, held by stakeholders in the environment of shared responsibilities. When changes are managed in such a way that there is shared responsibility for concept-proven, tested quality assurance, many benefits emerge. One benefit is that production engineers no longer have to re-hack the developer’s deliverable in a production environment. The production environment becomes adaptive to proven concepts within an ecosystem of collaboration. Collaboration is like competition on steroids, especially in the fields of shared interest where quality is upheld and orchestrated by multiple groups.

Quality assurance can be achieved through collaborative testing. Rolling back to a last known good configuration/implementation is much less necessary when it’s efficient to roll forward to a tested configuration/implementation. This is made possible by technology, of course. Technology is the study, practice, and pursuit of productivity through tools. A technologist seeks to invent tools or teach those around him the way to use a tool or tool-set. In a tightly integrated DevOps core with sustainable gravity, responsibility and tool-sets are shared among authoritative stakeholders in an iterative project. DevOps is much more than just an AGILE cat’s finesse. It’s the Chef’s best. Get Served.
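The roll-forward preference described above can be sketched in a few lines of Python. This is only an illustration under assumptions: the `Release` record, its `tested` flag, and the integer version numbers are hypothetical and do not come from any particular deployment tool.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Release:
    version: int   # hypothetical monotonically increasing release number
    tested: bool   # True once collaborative testing has signed off


def next_target(releases: List[Release], current: int) -> Optional[Release]:
    """Choose where to go when the current release misbehaves.

    Prefer rolling forward to the newest tested release after `current`.
    Only if none exists, roll back to the newest tested release before it.
    """
    tested = [r for r in releases if r.tested]
    newer = [r for r in tested if r.version > current]
    if newer:
        return max(newer, key=lambda r: r.version)
    older = [r for r in tested if r.version < current]
    if older:
        return max(older, key=lambda r: r.version)
    return None


if __name__ == "__main__":
    history = [Release(1, True), Release(2, False), Release(3, True)]
    # Release 2 is live and untested; a tested release 3 exists, so roll forward.
    print(next_target(history, current=2))
```

Rollback is still there as the fallback branch. It just stops being the first move once a tested configuration is waiting ahead of you.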

Welcome to McDevOps may I take your order?

McDevOps is the Management of changing DevOps. Configuration management is nothing new or simple, but it becomes artistic without a framework for best practice automation. Here’s why:

Automation can only be profitable in situations where human error introduces significant risk or antiperformance. Antiperformance often exists in situations where a reverse engineer has re-invented a wheel which cannot possibly spin faster. Be advised, however, that it may be considered hasty speculation for a reverse engineer to paint himself or herself out of the Innovation picture in fear of could-be antiperformance. Innovation does indeed require both grit and grid. Let’s not overlook the problem of semi-automation introducing risk when transparency is lost among service layers. Service-oriented Architecture is not resilient if foundations are simply service level agreements. Solid Service-oriented foundations are built from a lot more grit in the bricks.

Feedback looping, testing, dogfooding, metal and electricity create a fine mix and anything built from it becomes infrastructure, but perhaps only elastic provision can efficiently and effectively automate the process and balance the scale.

The design process never ends, which may seem counter-intuitive to the designer who is unfamiliar with iterative process momentum. Let me assure you that counter-intuition is key to innovation, especially in design practice. Iteration is often the seed of momentum, but many branches require many roots.

Quality may be challenging to control, but design principles make it very possible to contrive. Be aware that in some production environments, a tightly coupled or performance-oriented DevOps core may unfortunately exclude feedback at times, but should create a gravity of production insights from all three inner circles. Collaboration is like competition on steroids. The alternative is advancement along the production curve at the expense of innovative development and best practice designs as deliverables. In order to deliver cutting edge services, a service-oriented architect or Agile DevOps must bleed on the edge.

What framework will be designed as a platform for success in your sphere and how might the wheel spin faster, more efficiently, and more effectively?