Taxonomy of File Systems (Part 2)


This writeup is a condensed subset of the presentation “How to Build a Petabyte Sized Storage System” by Dr. Ray Paden, given at LISA ’09. The information is useful for administrators making decisions about file systems. For the full details, see “How to Build a Petabyte Sized Storage System”.

Taxonomy of File Systems (Part 1) dealt with three categories: Conventional I/O, Networked File Systems, and Network Attached Storage.

4. Basic Clustered File Systems

  1. File access is parallel (see the sketch after this list)
    • supports the POSIX API, but provides safe parallel file access semantics
  2. File system overhead operations
    • file system overhead operations are distributed and done in parallel
    • no single-server bottlenecks, i.e., no dedicated metadata servers
  3. Common component architecture
    • commonly configured using separate file clients and file servers (it costs too much to have a separate storage controller for every node)
    • some file systems allow a single component architecture where file clients and file servers are combined (i.e., no distinction between client and server, which yields very good scaling for asynchronous applications)
  4. File clients access file data through file servers via the LAN
  5. Examples: GPFS, GFS, IBRIX Fusion
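
To make “safe parallel file access through the POSIX API” concrete, here is a minimal C sketch (my own illustration, not from the original presentation): several processes write non-overlapping regions of one shared file with pwrite(). On a basic clustered file system such as GPFS, all of the writers can run concurrently and the file system keeps the result consistent. The mount point, region size, and process count are assumptions made for the example.

/* Minimal sketch: several processes share one file and each writes its own
 * non-overlapping region with pwrite(). On a basic clustered file system
 * (e.g. GPFS) all of them can do this concurrently through ordinary POSIX
 * calls. The path, region size, and process count are assumed for the
 * example, not taken from the presentation. */
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/wait.h>
#include <unistd.h>

#define NPROCS 4
#define REGION (1 << 20)                 /* 1 MiB per process */

int main(void)
{
    const char *path = "/gpfs/scratch/shared.dat";   /* assumed mount point */

    for (int rank = 0; rank < NPROCS; rank++) {
        if (fork() == 0) {                           /* child = one file client */
            int fd = open(path, O_WRONLY | O_CREAT, 0644);
            if (fd < 0) { perror("open"); _exit(1); }

            char *buf = malloc(REGION);
            memset(buf, 'A' + rank, REGION);

            /* Each process writes only its own byte range, so no two
             * writers ever touch the same region of the file. */
            if (pwrite(fd, buf, REGION, (off_t)rank * REGION) != REGION)
                perror("pwrite");

            free(buf);
            close(fd);
            _exit(0);
        }
    }

    for (int rank = 0; rank < NPROCS; rank++)
        wait(NULL);
    return 0;
}

Each child process plays the role of a file client; in a real cluster these would be separate nodes reaching the file servers over the LAN.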

5. SAN File Systems

  1. File access is parallel
    • supports the POSIX API, but provides parallel file access semantics
  2. File system overhead operations
    • not done in parallel
    • a single metadata server with a backup metadata server
    • the metadata server is accessed via the LAN
    • the metadata server is a potential bottleneck, but this is not considered a limitation since these file systems are generally used for smaller clusters
  3. Dual Component Architecture
    • file client/server and metadata server
  4. All disks are connected to all file client/server nodes via the SAN, not the LAN (see the sketch after this list)
    • file data is accessed via the SAN, not the LAN
    • this inhibits scaling due to the cost of an FC SAN
  5. Examples: StorNext, CXFS, QFS
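
From an application’s point of view, the split between the two paths looks roughly like this (a hedged sketch of the general SAN file system design, not specific to StorNext, CXFS, or QFS; the mount point is an assumed example): metadata calls are answered by the metadata server over the LAN, while the actual reads move blocks over the SAN.

/* Sketch of the two traffic paths in a SAN file system, as seen from an
 * ordinary POSIX program. The mapping of calls to paths follows the general
 * design described above; the mount point is an assumed example. */
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

int main(void)
{
    const char *path = "/sanfs/projects/results.dat";    /* assumed mount */
    struct stat st;
    char buf[4096];

    /* Metadata operations: the client asks the metadata server, over the
     * LAN, for the file's attributes and block layout. */
    if (stat(path, &st) != 0) { perror("stat"); return 1; }

    int fd = open(path, O_RDONLY);
    if (fd < 0) { perror("open"); return 1; }

    /* Data operations: with the layout known, the blocks are read directly
     * from the shared disks over the FC SAN, bypassing the metadata server. */
    ssize_t n = read(fd, buf, sizeof buf);
    if (n < 0)
        perror("read");
    else
        printf("read %zd bytes over the SAN data path\n", n);

    close(fd);
    return 0;
}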

6. Multi-Component File Systems

  1. File access is parallel
    • supports the POSIX API
  2. File System overhead operations
    • Lustre: one metadata server per file system (with a backup), accessed via the LAN
    • Lustre: the metadata server is a potential bottleneck (deploy multiple file systems to avoid it)
    • Panasas: the Director Blade manages the protocol
    • Panasas: each shelf contains a director blade and 10 disks, accessible via Ethernet
    • Panasas: this provides multiple metadata servers, reducing contention
  3. Multi-Component Architecture
    • Lustre: file clients, file servers, metadata servers
    • Panasas: file clients, director blades
    • Panasas: the Director Blade encapsulates file service, metadata service, and storage controller operations
  4. File clients access file data through file servers or director blades via the LAN (see the striping sketch after this list)
  5. Examples: Lustre, Panasas
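
One well-known detail behind item 4, not spelled out in the slides above, is how a client decides which file server holds a given part of a file: Lustre stripes each file round-robin across its object storage targets. The arithmetic below is a minimal sketch of that mapping; the stripe size and server count are illustrative assumptions, not values from the presentation.

/* Minimal sketch of round-robin striping: given a byte offset in a file,
 * compute which file server (e.g. a Lustre OST) holds that byte and the
 * offset within that server's object. Stripe size and server count are
 * illustrative assumptions. */
#include <stdint.h>
#include <stdio.h>

#define STRIPE_SIZE  (1u << 20)   /* 1 MiB stripes (assumed) */
#define NUM_SERVERS  8            /* 8 file servers / OSTs (assumed) */

/* Map a file offset to (server index, offset inside that server's object). */
static void map_offset(uint64_t file_off, unsigned *server, uint64_t *obj_off)
{
    uint64_t stripe_no = file_off / STRIPE_SIZE;   /* which stripe overall   */
    uint64_t stripe_in = file_off % STRIPE_SIZE;   /* position inside stripe */

    *server  = (unsigned)(stripe_no % NUM_SERVERS);          /* round robin  */
    *obj_off = (stripe_no / NUM_SERVERS) * STRIPE_SIZE + stripe_in;
}

int main(void)
{
    uint64_t offsets[] = { 0, 512 * 1024, 3 * (1u << 20) + 17, 100 * (1u << 20) };

    for (unsigned i = 0; i < sizeof offsets / sizeof offsets[0]; i++) {
        unsigned server;
        uint64_t obj_off;
        map_offset(offsets[i], &server, &obj_off);
        printf("file offset %12llu -> server %u, object offset %llu\n",
               (unsigned long long)offsets[i], server,
               (unsigned long long)obj_off);
    }
    return 0;
}

With this layout, a large sequential read fans out across all of the (assumed) eight servers, which is where the aggregate bandwidth of these file systems comes from.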
