IBM z13 Systems Design Film


Ross Mauri: z13 is our new mainframe. It’s
been designed from the ground up to support enormous scale and data and transactions to
support the mobile generation.
Jeff Frey: When we design the system, right,
we always make sure that we think about introducing new technologies in a way that will address
the requirements that our clients have, while at the same time positioning the platform
for what’s next.
Jeff Frey: So whether it’s cloud computing
or analytics or new business models or new uses of IT, the introduction of mobile, the
Internet of Things, are all driving a set of capabilities into our platform that require
us to … keep ahead of that … as the industry moves forward.
Eldee Stephens: Several thousand IBMers pour
their heart and soul into every single machine that we ship out the door. It takes years
of effort and hundreds of millions or even billions of dollars of research money
to produce. But in the end, every machine that we ship is, by definition, the most reliable,
the most secure and the most performant commercial system available in the market.
Jeff Frey: The new z13 is packed full of technology
innovation.
Jeff Frey: 22-nanometer cores, mega memory,
dynamic multithreading in each of the cores, advanced vector arithmetic processing known
as SIMD, FPGAs for fit-for-purpose acceleration of the workload, enterprise data compression
both on the core and through accelerator cards, new advances in next-generation virtualization.
And state-of-the-art cryptology with the new Crypto Express5S card. All coming together
to deliver industry-leading scale and performance.
Jeff Frey: We have always focused on a balanced
system design. It is not all just about processor performance. You have to feed the beast.
Jeff Frey: You get that from a combination
of a balanced design between the processor, the memory, the cache structure, which are
very important to providing our performance and scale, and internal bus bandwidths and
access to external I/O, storage and network.
Tarun Chopra: It’s not about a single technology.
It’s about combining all these innovations. We’re really pushing the system performance
boundaries, you know, combining all these features and functions.
Charles Webb: At the system level, we have up
to 141 configurable processors. Inside the processor, we can execute up to six instructions
at a time, double the number from the previous processors.
Jeff Frey: z13 offers double-digit
improvement in single-thread performance and up to 40% more capacity in the same environmental
footprint.
Eldee Stephens: We had to solve some pretty
cool technical challenges. On this particular processor, the amount of on-die cache, cache
memory on the chip, has more than doubled.
Eldee Stephens: So in a single frame you’ve
got more than four gigabytes of cache memory. We certainly have a lot more cores on the
chips themselves; there are eight cores per die.
Charles Webb: For us to share that cache among
the eight cores on the chip drives tremendous wiring requirements to get the request from
the chip.
Charles Webb: Just imagine, on a chip this
size there are 13 miles of wire and almost four billion transistors.
Jeff Frey: With z13 we’ve put a full 10
terabytes of main memory in the system. In addition to that, we’ve doubled the caching
structures to allow high-performance feeding of the engines.
Eldee Stephens: By increasing the amount of
memory bandwidth and aggregate memory on the machine, that does a lot of things. First
of all, for transaction processing it means response times are better. Your latency is
a lot less. They’re able to consolidate more and more physical servers into a single
image.
Jeff Frey: Because we have such a large pool
of resources and manage those resources effectively, we can allocate those resources across the
workload and keep utilization up and drive scalability on a platform like no other.
Jeff Frey: With z13, we have the ability to
scale up to 8,000 virtual machines on a single physical system.
Jeff Frey: I think the best-kept secret of
the z System is its I/O processing capability.
Jeff Frey: And I would characterize this system
as an I/O monster.
Eldee Stephens: We have customers with thousands
upon thousands of I/O devices, tens of thousands, hundreds of thousands of I/O devices, so I/O
bandwidth is incredibly important for them.
Eldee Stephens: We have FICON Express16S
for better performance for their I/O. You’ve got a hundred and sixty I/O cards, just a little
bit more than your average server. As you might imagine, that’s three hundred and twenty
FICON channels available to the customer.
Jeff Frey: Each one of these channels has
two PowerPC cores on it. We are talking about potentially 600, 700 cores’ worth of processing
power just to drive I/O.
Jeff Frey: That allows the core processors
to do more work, and that is part of what makes our system so efficient and provides us the
ability to drive utilizations up to near 100%, where other platforms are stuck at 30% or
40% because a single core of the processor has to do all of that work.
Eldee Stephens: There is simply no machine
on earth that has in a single frame the amount of I/O bandwidth and capability that z does
– none.
Jeff Frey: Additional intelligence in our
I/O subsystem allows us to route around congestion in the fabric and even route around error
conditions without ever affecting the application.
Jeff Frey: We’ve introduced two key new
technologies aimed specifically at improving performance and scalability of our system.
Jeff Frey: SIMD, which is Single Instruction, Multiple Data, introduces vector processing
in the platform to dramatically improve the performance of computationally
intensive workloads.
Eldee Stephens: It enables us to do mathematical
operations on much larger data sets than we might have been able to before.
Jeff Frey: We’ve also introduced the capability,
simultaneous multithreading, which provides two threads of execution in the core.
Jeff Frey: SMT will be enabled for certain
types of workloads that can take advantage of the parallelism.
Jeff Frey: We expect to get up to forty percent
additional throughput and capacity out of a single core, right, for those types
of workloads.
Jeff Frey: z13 is an open system. Examples
include Linux on our platform, Java, JSON, many of the languages and development environments
and scripting languages that are pervasive now in building new applications such as
mobile applications.
Kelly Ryan: If you have Java skills, if you
have Linux skills, if you can do application development out in the cloud where the service
is, that can be transparent going back to the z13.
Kelly Ryan: We have features and functions
like z/OS Connect. z/OS Connect takes a state-of-the-art application development system
like IBM’s Bluemix and connects it to the back-end system of record through a set of
standard APIs.
Tarun Chopra: We have industry leadership
performance on Java in the previous generation of platforms as well. On z13 we are enabling
SIMD instructions in the Java compiler itself, right. So customers can get the benefit
in applications that leverage the just-in-time Java compiler.
Jeff Frey: With this new platform, we’re going
to introduce state-of-the-art next-generation virtualization with standard KVM and standard
OpenStack-based management tooling to manage the virtual environment.
Eldee Stephens: And it’s not just at the
software virtualization level, it’s also at the PR/SM level, the actual management layer
that manages the logical partitions on the machine. That has also been enabled with drivers
for the OpenStack management tools.
Jeff Frey: With the new system we’re taking
our world-class multi-site disaster recovery capability, our dispersed sysplex capability,
and making that available to Linux-only environments.
Jeff Frey: The z13 platform delivers integrated
intelligence. A great example of intelligence in our system is our workload management capability.
Jeff Frey: So instead of having system programmers
and operational staff tweak all kinds of knobs and dials to try and get the system tuned,
our system dynamically adjusts and makes those optimizations in real time based on its understanding
of the workloads.
Jeff Frey: In addition to that, you can apply
intelligence so that you can make the system more resilient.
Jeff Frey: Our zAware capability is a major
advancement in resiliency analytics. It basically allows us to spot potential problems
in the system before they actually occur. And now this capability is available for our
Linux environments, as well.
Jeff Frey: We’ve engineered z13 to provide
the highest levels of security available.
Eldee Stephens: So with z13 we’re shipping
our next-generation cryptographic co-processing technology, and this enables people to handle
a tremendous number of cryptographic keys and secure their data in real time across
tremendous numbers of sources. Additionally, security goes into every system that we make.
We’re the only system in the commercial world that is EAL5+ compliant.
Eysha Powers: What’s nice about encryption
on z is we have so many different levels of protection for both your keys and your data.
Eysha Powers: So, we give you the ability
to do encrypt, decrypt, sign, verify. We also have master key management. So, we have these
big data sets where we can store, you know, millions of keys that you can use in
these crypto applications.
Eysha Powers: You have the data encrypted
kind of end to end, so that essentially at no point along this transaction can anyone
actually get to the data without the proper key.
Eysha Powers: One of my favorite features
that’s going into the next z system will be format-preserving encryption. I think it’s
a pretty awesome technology. You can actually encrypt the data in place. So, it gives you
the ability to actually do encryption but not have to change your existing systems.
Bruce Hill: My job as part of the Engineering
System Test Team is to try to break the mainframe, but we’re doing that towards making it a
better and more resilient system for the customer.
Bruce Hill: So in our thermal testing, what
we try to do is take the machine up to a higher temperature than would ever be seen in a
customer data center, and we’ll take the voltage and frequency and vary those within
the test as well. So now you’ve got kind of a worst-case scenario where you have high
temperature, high voltage, low frequency, and will the processor still stay up? Will the
machine still run?
Eldee Stephens: We have to make certain that
no matter whether you’ve deployed your Z in the middle of the Mojave Desert or you
decided to do so somewhere in Alaska, these systems are able to survive in almost any
environment so that their customers, no matter what happens within the data center itself,
never see a blip.
Bruce Hill: We do shock and vibration testing,
we do tilt tests where we actually tilt the machine over and try to see what happens when
that happens, and we also do earthquake testing as part of our compliance testing.
Patrick Laplace: Here in this laboratory, we can run any type of earthquake motion that
has existed or has been measured.
Patrick Laplace: If an IBM mainframe can pass
these motions, then it can be qualified for all the types of earthquakes that we may see
in all the types of buildings that the mainframes may be in.
Patrick Laplace: IBM isn’t interested in just pass/fail. They’re interested in every
little detail inside that mainframe. And we go above and beyond testing with IBM because
they want to take their equipment above and beyond what the code requires.
Bruce Hill: The other part of the test area is primarily trying to emulate customer workloads
much higher than even any typical customer would run.
Bruce Hill: We can’t even do that if we
are just running the typical kernel or software like Linux or z/OS. It is almost impossible
for us to stack it enough to break it, so we use our own kernel to do that. We also inject
failures into the machine. For instance, we take one of the cooling units down out of
the system. It can automatically sense that the cooling unit has gone down, and it will dial
down the processor if it needs to slow it down because it is getting too hot. It will make
a call and say … I’ve got a bad cooling unit, please come fix me. And then it will
turn up the fan speeds, so it automatically knows this is what I need to do to stay cool
enough to keep running at the highest level I can.
Kelly Ryan: That’s what high availability
is about. We have 24/7 support for our clients whether it’s an operating system or the
hardware. We have many, many different types of error checking built across the system
to prevent the system from ever having a problem.
Jeff Frey: Our commitment to continuous, reliable
operations extends to the delivery, the deployment, upgrade, and service of our systems for our
clients.
Kelly Ryan: The design of the system is such
that you can upgrade that in eight hours or less. We can transparently do that: move the
workload, roll a new system in, have the new system up and running, move the workload
back, and the system’s running.
John Torok: For z13 you’ll see on the front
covers that we’ve carried forward with the geometric shape. It’s a real cool look.
And the covers themselves offer tremendous functionality from the standpoint of airflow
management as well as acoustic attenuation.
John Torok: As we get very dense from
a footprint standpoint, there is a fair amount of heat generated. There’s a fair amount
of airflow that takes place.
John Torok: To make that more efficient we’ve
actually redesigned the rear covers for our system. And we’ve allowed those to have
the airflow vectorized. They’ll either vector up or vector down … So we’ve actually
designed the rear cover to have that flexibility to be configured on site.
Jeff Frey: The new z13 pushes the boundaries of computing so that businesses can innovate
without constraint.
Eldee Stephens: The people who work on z absolutely
love what they do and they absolutely love the machines that we build, and we’re always
very excited to see what our customers are capable of coming up with when they use it.
Eysha Powers: I’m Eysha Powers. I work on
mainframes.
Eldee Stephens: I’m Eldee Stephens and I
work on mainframes.
Tarun Chopra: I’m Tarun Chopra and I work
on mainframes.
Kelly Ryan: I’m Kelly Ryan and I work on
mainframes.
John Torok: I’m John Torok and I work on
mainframes.
Bruce Hill: I’m Bruce Hill and I work on
mainframes.
