Saturday, April 5, 2008

A Solution describes the implementation of an Architecture by defining reusable building blocks.
Four solution categories:
- Products & Services
- Systems Solutions
- Industry Solutions
- Enterprise Solutions



An Architecture specifies the structuring of reusable architecture assets, and includes rules, representations and relationships of the information system(s) available to the enterprise.

ANSI/IEEE Standard 1471-2000
Specification of architecture: "the fundamental organization of a system, embodied in its components, their relationships to each other and the environment, and the principles governing its design and evolution."

Architecture is a "formal description of a system, or a detailed plan of the system at component level to guide its implementation", or as "the structure of components, their interrelationships, and the principles and guidelines governing their design and evolution over time."
Building blocks extend the framework concept to architecting in an IT environment. A building block approach helps categorize the components of an IT architecture into hard, soft, and connector building blocks. Hard building blocks are a combination of software and hardware, and can further be divided into systemic and application tier building blocks. Soft building blocks are software entities like Enterprise Java Beans (EJBs). Connector building blocks are the glue that connects all the components. Building blocks, and architectures built from building blocks, might use one or more architecture patterns.
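To make the categorization concrete, here is a minimal Python sketch of a building-block catalogue using the three kinds named above; the block names in the example catalogue are invented for illustration, not taken from any real architecture.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class BlockKind(Enum):
    HARD = "hard"            # combination of software and hardware
    SOFT = "soft"            # pure software entities, e.g. EJBs
    CONNECTOR = "connector"  # glue between the other components


@dataclass
class BuildingBlock:
    name: str
    kind: BlockKind
    # hard blocks further divide into "systemic" or "application" tier
    tier: Optional[str] = None


# illustrative catalogue (names are hypothetical)
catalogue = [
    BuildingBlock("database-appliance", BlockKind.HARD, tier="systemic"),
    BuildingBlock("order-ejb", BlockKind.SOFT),
    BuildingBlock("message-bus", BlockKind.CONNECTOR),
]


def by_kind(blocks: List[BuildingBlock], kind: BlockKind) -> List[str]:
    """List the names of the blocks belonging to one category."""
    return [b.name for b in blocks if b.kind is kind]
```

Grouping a catalogue this way is what lets an architecture team reuse the same hard or connector block across several solutions.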

Enterprise Continuum

This is an important aid to communication and understanding, both within individual enterprises, and between customer enterprises and vendor organizations. Without an understanding of "where in the continuum you are", people discussing architecture can often talk at cross purposes because they are referencing different points in the continuum at the same time, without realizing it.
Quote: "Not only does the Enterprise Continuum represent an aid to communication, it represents an aid to organizing re-usable architecture and solution assets."


Monday, March 31, 2008

Reliability rules

Here is an example of what makes a storage box unreliable: a bug found on Centera
that can make the whole cluster totally unresponsive.

Symptom: The CentraStar software running on Gen4/Gen4LP nodes
may begin to restart frequently 30-90 days after upgrading to
version 3.1.3. The IPMI driver (Intel's Intelligent Platform
Management Interface) consumes small amounts of memory
over 30-90 days without returning it for use by other, more
critical components, which ultimately run out of
memory and cause a restart of the CentraStar software
running on the node. As a result, the node goes offline.
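The failure mode above is a classic slow leak: free memory declines steadily for weeks before anything crashes. As a rough illustration (not Centera's actual monitoring, and with an invented threshold), a watchdog could flag the trend long before the node restarts:

```python
from typing import List


def leaking_memory(free_mb_samples: List[float], min_points: int = 5) -> bool:
    """Return True if free memory declines monotonically across samples.

    `free_mb_samples` is a chronological list of free-memory readings (MB),
    e.g. one per day. A strictly decreasing trend over enough points is the
    signature of a slow leak like the one described above. The minimum
    sample count is an arbitrary choice for this sketch.
    """
    if len(free_mb_samples) < min_points:
        return False
    return all(later < earlier
               for earlier, later in zip(free_mb_samples, free_mb_samples[1:]))
```

Catching the trend at day 10 instead of day 60 turns an outage into a planned maintenance action.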

Reliability is what we (customers) are after

Saturday, March 29, 2008

Reliability before Performance

Performance is a very important metric when shopping for an IT infrastructure, and for storage in particular. Machines are continuously gaining in performance, and some now believe that the performance offered by most computing components exceeds what applications require.
Although performance is always taken into account when choosing a solution, I believe the foremost criterion for success is Reliability.
Vendors: first build a reliability-centric solution, then add performance to it. Thanks!

Monday, March 3, 2008

There are two ways to benchmark an IO stack (storage/OS/filesystem) against a given application:
  • Install the application
  • Install an application simulator
Installing the application can be complicated... for a complicated application.
On the other hand, running an application simulator is great, because it is quicker to install and run.
As I wanted to find the best filesystem to run a mail server, I stumbled on this great mailstore simulator.
I was able to run numerous test runs before deploying my application.
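To give a flavour of what such a simulator does, here is a minimal hand-rolled sketch that times a mailstore-like workload: many small files, one per message, written and read back. File count and message size are invented and far smaller than a real benchmark run; point the directory at mount points of the candidate filesystems to compare them.

```python
import os
import tempfile
import time


def small_file_workload(directory: str, n_files: int = 1000,
                        size: int = 4096) -> float:
    """Write then read back n small files, mimicking a mail store
    (one message per file). Returns elapsed wall-clock seconds."""
    payload = os.urandom(size)
    start = time.perf_counter()
    for i in range(n_files):
        with open(os.path.join(directory, f"msg_{i}.eml"), "wb") as f:
            f.write(payload)
    for i in range(n_files):
        with open(os.path.join(directory, f"msg_{i}.eml"), "rb") as f:
            f.read()
    return time.perf_counter() - start


# run the same workload against each candidate filesystem's mount point
with tempfile.TemporaryDirectory() as d:
    elapsed = small_file_workload(d, n_files=100)
    print(f"{elapsed:.3f}s for 100 messages")
```

A real simulator adds concurrency, fsync behaviour, and realistic message-size distributions; the ones listed below do exactly that.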

Other interesting simulators I used:
- filebench
- Netapp Simulator
- vdbench
- iometer
- slamd
- mailstore simulator

Friday, January 11, 2008

Information Technology Data Cycle

I. Storage holds Data
In the same way that carbon dioxide is captured and becomes part of natural resources, business data is captured and becomes part of storage systems.

Data can be seen as an unwanted by-product of business activity: while the primary goal of a business is to make money, it also results in the production and proliferation of data ("data emission"), which, following the "Information Technology Data Cycle" (ITDC), is captured in storage systems. In the same way a musician creates music from noise, a business creates Information from accumulated data.

One way of measuring a business's efficiency is Return-On-Investment calculation: assessing the value created from investments. One way of adding value to a business is to create new products and services out of various sources, for example published market intelligence reports or analysis of historical data on sales and customer behavior. In other words, a business creates many financial reports, such as sales performance or consumer spending trends, so as to derive new meaning from all the data accumulated over the course of its activity. Finally, since every piece of data is potentially a valuable asset, a corporation must preserve it all.

Without a doubt, every business emits data, all of which must be kept, as it can potentially generate extra value once Information is extracted from it. On the other hand, an excess of information can lead to inefficient decision making, hence a loss of business agility.

In the same manner as energy, storage is a limited resource. We propose an approach to storage management that is as natural as managing energy nowadays: invest in efficient technologies in line with defined targets.

II. Storage types

1) Business critical storage: Tier-1
no consumable parts
short lifetime (4-5 years)
forklift upgrade
high electrical power
high cooling power
Goal: running business critical services

High-end storage for business critical applications. Transactional workloads are the prime target. No downtime allowed. Made of specialized storage controllers, fully redundant and maintainable without disruption.
Production lifetime is 4 to 5 years on average. Hardware upgrades are forklift upgrades, due to the specialized storage controllers. High manufacturing quality control.

2) General Purpose Storage: Tier-2
High performance, highly available storage repositories aimed mainly at non business critical applications, while still offering storage for transactional workloads.
Built out of general purpose components, in order to maintain a low price/GB.
Price is up to 1/3 of Tier-1 for the same performance. Expect possible downtime during some maintenance windows. Quality control is lesser than on Tier-1.
Lifetime is equivalent to Tier-1 (4-5 years on average). Hardware upgrades are done on a component basis (modular upgrades).

3) Capacity oriented storage: Tier-3
Made partly from consumable parts (such as tapes/cartridges...)
long lifetime (up to 20 years)
very low power requirements
very low heat emission

Mixed storage systems (tapes + disks), in order to offer very large storage repositories while maintaining performance in line with expectations. Price/GB is half that of Tier-2.
High upgradeability (software and hardware) and long production life expectancy: 4 to 5 times longer than Tier-1 and Tier-2. Migration is an important process and must be possible without vendor lock-in (adoption of storage standards is mandatory; open source is a guarantee). This tier aims at offering very large storage for both long term storage and large filesystems, while making sure the customer controls their data.
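The relative figures quoted for the three tiers (price ratios, lifetimes, downtime tolerance) can be encoded in a small table to drive tier selection. This is only a sketch: prices are normalized to Tier-1 = 1.0 using the "up to 1/3" and "half of Tier-2" ratios above, and the selection rule is deliberately simplistic.

```python
# relative figures derived from the tier descriptions above
tiers = {
    "tier1": {"price_per_gb": 1.0,     "lifetime_years": 5,  "downtime_ok": False},
    "tier2": {"price_per_gb": 1 / 3.0, "lifetime_years": 5,  "downtime_ok": True},
    "tier3": {"price_per_gb": 1 / 6.0, "lifetime_years": 20, "downtime_ok": True},
}


def cheapest_tier(downtime_tolerated: bool) -> str:
    """Pick the lowest price/GB tier compatible with the downtime constraint.

    If the application tolerates no downtime, only tiers rated for
    disruption-free maintenance remain candidates.
    """
    candidates = [name for name, t in tiers.items()
                  if downtime_tolerated or not t["downtime_ok"]]
    return min(candidates, key=lambda n: tiers[n]["price_per_gb"])
```

A real tiering decision would also weigh performance, capacity bounds, and retention requirements; the next sections list exactly those characteristics.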

III. Storage Performance Characteristics


IV. Storage Efficiency Characteristics

Price per unit of storage ($/GB)
Watts per unit of storage (W/GB)
FRU per unit of storage (#FRU/TB)
Lifetime per unit of storage
Maximum/minimum storage capacity
Usable vs Raw storage ratio
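Most of the per-unit figures above are simple ratios; here is a sketch that computes them for a hypothetical array (the $120k / 10 TB / 2 kW / 40-FRU figures are invented for the example). Lifetime and capacity bounds come from vendor datasheets rather than arithmetic, so they are omitted.

```python
def efficiency_metrics(price_usd: float, raw_gb: float, watts: float,
                       fru_count: int, usable_gb: float) -> dict:
    """Compute per-unit storage efficiency figures for one array."""
    return {
        "usd_per_gb": price_usd / raw_gb,        # $/GB
        "watts_per_gb": watts / raw_gb,          # W/GB
        "fru_per_tb": fru_count / (raw_gb / 1000.0),  # #FRU/TB
        "usable_ratio": usable_gb / raw_gb,      # usable vs raw
    }


# hypothetical array: $120k, 10 TB raw / 7 TB usable, 2 kW, 40 FRUs
m = efficiency_metrics(120_000, 10_000, 2_000, 40, 7_000)
```

Computed consistently across candidate arrays, these ratios make tier comparisons straightforward.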

Thursday, January 10, 2008

Open Doors!

Vendor: Really, you should install our latest remote console, it's a win-win for you and us.
Me: We can, but only to send alerts; we don't want remote control over our infrastructure.
Vendor: In that case we aren't interested in installing it. You know, it's a very secure console, with VPN, encryption, strong authentication, CERT compliant; you really don't have to worry about security.
Me: (here we go again with remote monitoring)... We cannot allow external people to take control over our infrastructure and risk downtime, it's as simple as that.
Vendor: Listen, the other day we received an alarm for a drive failure at a customer site. I connected to the box to assess its state. That allowed me to check that the rebuild was taking place correctly, and I even corrected the alerting plus one or two wrongly set parameters. You see, it's very useful for you, we can take proactive actions, isn't that great??!!
Me: (is this guy dumb or what, I wonder)... That is exactly what we cannot accept: we cannot allow a guy to connect to our boxes and make corrective actions, can you understand this? You might correct things, but you can also potentially disrupt things. I'll tell you what we are going to do: if something breaks, you receive an alarm, you take your car and you come right away to fix the problem under our supervision; what do you think about that? Isn't that what our $1M contract says anyway?

Do you think we are a fast-food?

As a customer, what kind of remote monitoring solution would fit my IT governance ?

1) One way remote monitoring
A remote monitoring system that is only able to send reports and alarms to my vendor is acceptable. I must make sure that only authorized data is sent over (non-confidential data is defined by the IT security group, in accordance with the IT Strategy & Principles policy).

2) Two way remote monitoring
if the vendor puts in place a mechanism whereby we authorize him to:
a) connect only to the faulty component
b) troubleshoot only the fault concerned
The system in place must also record every action performed, by securely storing (write-once) all the audit logs.
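The write-once requirement on audit logs can be approximated in software by hash-chaining each record to the previous one, so that after-the-fact edits are detectable. This is an illustrative stand-in for true WORM storage, not a description of any vendor's mechanism:

```python
import hashlib
import json
import time


class AuditLog:
    """Append-only audit trail; each record carries the hash of the
    previous one, so tampering with any stored record breaks the chain."""

    GENESIS = "0" * 64

    def __init__(self) -> None:
        self._records = []
        self._last_hash = self.GENESIS

    def append(self, actor: str, action: str) -> None:
        """Record who did what; chains the record to its predecessor."""
        record = {"ts": time.time(), "actor": actor,
                  "action": action, "prev": self._last_hash}
        self._last_hash = hashlib.sha256(
            json.dumps(record, sort_keys=True).encode()).hexdigest()
        self._records.append(record)

    def verify(self) -> bool:
        """Recompute the whole chain; True only if no record was altered."""
        prev = self.GENESIS
        for r in self._records:
            if r["prev"] != prev:
                return False
            prev = hashlib.sha256(
                json.dumps(r, sort_keys=True).encode()).hexdigest()
        return prev == self._last_hash
```

With such a trail in place, "the vendor connected and changed two parameters" is a verifiable fact rather than a verbal claim.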

Tuesday, January 8, 2008

Building Blocks to Develop

1. Risk Management in SAN
    • How to detect and measure risk in SAN configuration
2. Change Management in SAN
    • How to track and simulate changes in a SAN
3. Storage Provisioning Process
    • 3-step approach to storage provisioning: Reservation, Configuration, Allocation
4. Storage Tiering Definition
    • Storage tiering definition varies between a vendor and a customer perspective
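The three provisioning steps in item 3 form a strict sequence, which can be sketched as a small state machine; the state names follow the post, while the transition rule (no step-skipping) is my assumption about how such a process would be enforced:

```python
from enum import Enum


class ProvisioningState(Enum):
    REQUESTED = 0
    RESERVED = 1     # capacity earmarked on a tier
    CONFIGURED = 2   # volume created and secured
    ALLOCATED = 3    # presented to the host


# the three steps named above, in order; skipping a step is rejected
ORDER = [ProvisioningState.REQUESTED, ProvisioningState.RESERVED,
         ProvisioningState.CONFIGURED, ProvisioningState.ALLOCATED]


def advance(state: ProvisioningState) -> ProvisioningState:
    """Move a provisioning request to the next step in the sequence."""
    i = ORDER.index(state)
    if i == len(ORDER) - 1:
        raise ValueError("request is already allocated")
    return ORDER[i + 1]
```

Modeling provisioning as explicit states also gives change management (item 2) something concrete to track.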

Thursday, January 3, 2008

Storage Architecture: Table of Contents

Table of Contents
  • Introduction
  • Related Documents
  • Strategies & Principles
    • Information Technology Principles
    • Storage Technology Principles
    • Overview of a Software Architecture
  • Storage Architecture Building Blocks
    • Media Types
    • Storage Tiering
    • Topology
      • Internal Disk
      • DAS: Direct Attached Storage
      • SAN: Storage Area Network
      • NAS: Network Attached Storage
      • Consolidated Storage
      • Physical Isolation
      • Logical Isolation
      • Virtualisation
      • Storage Transport Technology
      • SAN High Level Architecture
    • Data Protection
      • Replication
      • Mirroring
      • Backup
    • Archiving
      • General Purpose Archiving
      • Compliant Archiving
    • Storage Security Principles
    • Storage Provisioning Principles
    • Storage Management Principles
    • Storage Usage patterns
    • Tiering Selection Principles
    • Naming convention for Storage
    • Storage as a Service
    • Storage Services portfolio
    • Storage Cost Center
  • Summary of Design