|
Short task description
Storing, securing and providing access to the continuously growing
volume of data generated by a distributed user environment on nationwide
computational resources are among the most important issues in the use
of the SGI cluster. For the national cluster, a natural solution is to
create a hierarchical data management meta-system (MHSM, Meta
Hierarchical Storage Management) built on the local HSM systems that
currently exist in Gdańsk, Kraków, Poznań and Wrocław and are planned
for Łódź and Warszawa (IMGW). Work on the MHSM system will
concentrate on the following issues:
- Functional development of existing local HSM systems
- Meta database (MDB) for locating distributed data and a common
API for local HSM systems
- Integration, testing and improvements
Functional development of local HSM systems
consists of creating a subsystem for fast access to huge files stored on
tape and a subsystem that estimates file access time. The aim of the first
sub-task is to design and implement a subsystem that ensures fast access to
huge files stored on magnetic tape by means of a file division strategy. To
implement this strategy, a subsystem residing between the client
application and the HSM system must be designed. While files are being
written to tape, this subsystem will split them into pieces (subfiles),
transparently to the user. Information about file fragmentation will be
stored in an index database. To keep the data transmission rate constant
and shorten the transmission itself, prefetching will also be applied:
the next subfile is fetched into the system cache while the previous one
is being transmitted from the system cache to the client application. The
aim of the second sub-task is to design and implement a subsystem that can
report the expected access time to a file in a given HSM. The answer
depends on many factors, such as HSM load, queue length, the number and
throughput of the drives, and file size.
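
The file division and prefetching strategy described above can be sketched as follows. This is only an illustration under stated assumptions, not the project's design: the names split_into_subfiles, read_with_prefetch, fetch_subfile and cache_slots are hypothetical, the index is an in-memory list standing in for the index database, and a bounded queue stands in for the system cache.

```python
import queue
import threading


def split_into_subfiles(data: bytes, subfile_size: int):
    """Split data into fixed-size subfiles and build an (offset, length) index.

    Hypothetical helper: a real subsystem would write the pieces to tape
    and store the index in the index database.
    """
    subfiles = []
    index = []
    for offset in range(0, len(data), subfile_size):
        piece = data[offset:offset + subfile_size]
        subfiles.append(piece)
        index.append((offset, len(piece)))
    return subfiles, index


def read_with_prefetch(fetch_subfile, index, cache_slots=1):
    """Yield subfiles in order while later ones are fetched in the background.

    fetch_subfile(number) is an assumed callable that reads one subfile
    from the HSM; the bounded queue plays the role of the system cache.
    """
    cache = queue.Queue(maxsize=cache_slots)
    done = object()  # sentinel marking the end of the file

    def prefetcher():
        try:
            for number in range(len(index)):
                # Blocks while the cache is full, i.e. while the client
                # is still consuming previously fetched subfiles.
                cache.put(fetch_subfile(number))
            cache.put(done)
        except Exception as exc:
            cache.put(exc)  # hand the error over to the consumer

    threading.Thread(target=prefetcher, daemon=True).start()

    while True:
        piece = cache.get()
        if piece is done:
            return
        if isinstance(piece, Exception):
            raise piece
        yield piece


# Usage: a list of subfiles stands in for the tape.
data = bytes(range(256)) * 40
tape, index = split_into_subfiles(data, subfile_size=4096)
restored = b"".join(read_with_prefetch(lambda n: tape[n], index))
```

With cache_slots=1 at most one fetched subfile waits in the cache while the prefetcher already reads the next one, so tape reads overlap with delivery to the client; a larger value trades cache space for more tolerance to uneven fetch times.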
|