s3fs-fuse/doc/man/s3fs.1

224 lines
12 KiB
Groff
Raw Normal View History

.TH S3FS "1" "February 2011" "S3FS" "User Commands"
.SH NAME
S3FS \- FUSE-based file system backed by Amazon S3
.SH SYNOPSIS
.SS mounting
.TP
\fBs3fs bucket[:/path] mountpoint \fP [options]
.SS unmounting
.TP
\fBumount mountpoint
.SS utility mode ( remove interrupted multipart uploading objects )
.TP
2014-10-01 10:20:29 +00:00
\fBs3fs \-u bucket
.SH DESCRIPTION
s3fs is a FUSE filesystem that allows you to mount an Amazon S3 bucket as a local filesystem. It stores files natively and transparently in S3 (i.e., you can use other programs to access the same files).
.SH AUTHENTICATION
The s3fs password file has this format (use this format if you have only one set of credentials):
.RS 4
\fBaccessKeyId\fP:\fBsecretAccessKey\fP
.RE
If you have more than one set of credentials, this syntax is also recognized:
.RS 4
\fBbucketName\fP:\fBaccessKeyId\fP:\fBsecretAccessKey\fP
.RE
.PP
Password files can be stored in two locations:
.RS 4
\fB/etc/passwd-s3fs\fP [0640]
\fB$HOME/.passwd-s3fs\fP [0600]
.RE
.SH OPTIONS
.SS "general options"
.TP
\fB\-h\fR \fB\-\-help\fR
print help
.TP
\fB\ \fR \fB\-\-version\fR
print version
.TP
\fB\-f\fR
FUSE foreground option - do not run as daemon.
.TP
\fB\-s\fR
FUSE singlethreaded option (disables multi-threaded operation)
.SS "mount options"
.TP
All s3fs options must given in the form where "opt" is:
<option_name>=<option_value>
.TP
\fB\-o\fR default_acl (default="private")
the default canned acl to apply to all written S3 objects, e.g., "public-read".
Any created files will have this canned acl.
Any updated files will also have this canned acl applied!
.TP
\fB\-o\fR retries (default="2")
number of times to retry a failed S3 transaction.
.TP
\fB\-o\fR use_cache (default="" which means disabled)
local folder to use for local file cache.
.TP
Changes codes for performance(part 3) * Summay This revision includes big change about temporary file and local cache file. By this big change, s3fs works with good performance when s3fs opens/ closes/syncs/reads object. I made a big change about the handling about temporary file and local cache file to do this implementation. * Detail 1) About temporary file(local file) s3fs uses a temporary file on local file system when s3fs does download/ upload/open/seek object on S3. After this revision, s3fs calls ftruncate() function when s3fs makes the temporary file. In this way s3fs can set a file size of precisely length without downloading. (Notice - ftruncate function is for XSI-compliant systems, so that possibly you have a problem on non-XSI-compliant systems.) By this change, s3fs can download a part of a object by requesting with "Range" http header. It seems like downloading by each block unit. The default block(part) size is 50MB, it is caused the result which is default parallel requests count(5) by default multipart upload size(10MB). If you need to change this block size, you can change by new option "fd_page_size". This option can take from 1MB(1024 * 1024) to any bytes. So that, you have to take care about that fdcache.cpp(and fdcache.h) were changed a lot. 2) About local cache Local cache files which are in directory specified by "use_cache" option do not have always all of object data. This cause is that s3fs uses ftruncate function and reads(writes) each block unit of a temporary file. s3fs manages each block unit's status which are "downloaded area" or "not". For this status, s3fs makes new temporary file in cache directory which is specified by "use_cache" option. This status files is in a directory which is named "<use_cache sirectory>/.<bucket_name>/". When s3fs opens this status file, s3fs locks this file for exclusive control by calling flock function. You need to take care about this, the status files can not be laid on network drive(like NFS). This revision changes about file open mode, s3fs always opens a local cache file and each status file with writable mode. Last, this revision adds new option "del_cache", this option means that s3fs deletes all local cache file when s3fs starts and exits. 3) Uploading When s3fs writes data to file descriptor through FUSE request, old s3fs revision downloads all of the object. But new revision does not download all, it downloads only small percial area(some block units) including writing data area. And when s3fs closes or flushes the file descriptor, s3fs downloads other area which is not downloaded from server. After that, s3fs uploads all of data. Already r456 revision has parallel upload function, then this revision with r456 and r457 are very big change for performance. 4) Downloading By changing a temporary file and a local cache file, when s3fs downloads a object, it downloads only the required range(some block units). And s3fs downloads units by parallel GET request, it is same as a case of uploading. (Maximum parallel request count and each download size are specified same parameters for uploading.) In the new revision, when s3fs opens file, s3fs returns file descriptor soon. Because s3fs only opens(makes) the file descriptor with no downloading data. And when s3fs reads a data, s3fs downloads only some block unit including specified area. This result is good for performance. 5) Changes option name The option "parallel_upload" which added at r456 is changed to new option name as "parallel_count". This reason is this option value is not only used by uploading object, but a uploading object also uses this option. (For a while, you can use old option name "parallel_upload" for compatibility.) git-svn-id: http://s3fs.googlecode.com/svn/trunk@458 df820570-a93a-0410-bd06-b72b767a4274
2013-07-23 16:01:48 +00:00
\fB\-o\fR del_cache - delete local file cache
delete local file cache when s3fs starts and exits.
.TP
\fB\-o\fR storage_class (default is standard)
store object with specified storage class.
this option replaces the old option use_rrs.
Possible values: standard, standard_ia, and reduced_redundancy.
.TP
\fB\-o\fR use_rrs (default is disable)
use Amazon's Reduced Redundancy Storage.
this option can not be specified with use_sse.
(can specify use_rrs=1 for old version)
this option has been replaced by new storage_class option.
.TP
\fB\-o\fR use_sse (default is disable)
2014-10-01 10:20:29 +00:00
use Amazon's Server-Site Encryption or Server-Side Encryption with Customer-Provided Encryption Keys.
2014-07-19 19:02:55 +00:00
this option can not be specified with use_rrs. specifying only "use_sse" or "use_sse=1" enables Server-Side Encryption.(use_sse=1 for old version)
specifying this option with file path which has some SSE-C secret key enables Server-Side Encryption with Customer-Provided Encryption Keys.(use_sse=file)
2014-08-26 17:11:10 +00:00
the file must be 600 permission. the file can have some lines, each line is one SSE-C key. the first line in file is used as Customer-Provided Encryption Keys for uploading and change headers etc.
2014-07-19 19:02:55 +00:00
if there are some keys after first line, those are used downloading object which are encripted by not first key.
so that, you can keep all SSE-C keys in file, that is SSE-C key history.
2014-08-26 17:11:10 +00:00
if AWSSSECKEYS environment is set, you can set SSE-C key instead of this option.
.TP
\fB\-o\fR passwd_file (default="")
specify the path to the password file, which which takes precedence over the password in $HOME/.passwd-s3fs and /etc/passwd-s3fs
.TP
Fixed Issue 229 and Changes codes 1) Set metadata "Content-Encoding" automatically(Issue 292) For this issue, s3fs is added new option "ahbe_conf". New option means the configuration file path, and this file specifies additional HTTP header by file(object) extension. Thus you can specify any HTTP header for each object by extension. * ahbe_conf file format: ----------- line = [file suffix] HTTP-header [HTTP-header-values] file suffix = file(object) suffix, if this field is empty, it means "*"(all object). HTTP-header = additional HTTP header name HTTP-header-values = additional HTTP header value ----------- * Example: ----------- .gz Content-Encoding gzip .Z Content-Encoding compress X-S3FS-MYHTTPHEAD myvalue ----------- A sample configuration file is uploaded in "test" directory. If ahbe_conf parameter is specified, s3fs loads it's configuration and compares extension(suffix) of object(file) when uploading (PUT/POST) it. If the extension is same, s3fs adds/sends specified HTTP header and value. A case of sample configuration file, if a object(it's extension is ".gz") which already has Content-Encoding HTTP header is renamed to ".txt" extension, s3fs does not set Content-Encoding. Because ".txt" is not match any line in configuration file. So, s3fs matches the extension by each PUT/POST action. * Please take care about "Content-Encoding". This new option allows setting ANY HTTP header by object extension. For example, you can specify "Content-Encoding" for ".gz"/etc extension in configuration. But this means that S3 always returns "Content-Encoding: gzip" when a client requests with other "Accept-Encoding:" header. It SHOULD NOT be good. Please see RFC 2616. 2) Changes about allow_other/uid/gid option for mount point I reviewed about mount point permission and allow_other/uid/gid options, and found bugs about these. s3fs is fixed bugs and changed to the following specifications. * s3fs only allows uid(gid) options as 0(root), when the effective user is zero(root). * A mount point(directory) must have a permission to allow accessing by effective user/group. * If allow_other option is specified, the mount point permission is set 0777(all users allow all access). In another case, the mount point is set 0700(only allows effective user). * When uid/gid option is specified, the mount point owner/group is set uid/gid option value. If uid/gid is not set, it is set effective user/group id. This changes maybe fixes some issue(321, 338). 3) Changes a logic about (Issue 229) The chmod command returns -EIO when changing the mount point. It is correct, s3fs can not changed owner/group/mtime for the mount point, but s3fs sends a request for changing the bucket. This revision does not send the request, and returns EIO as soon as possible. git-svn-id: http://s3fs.googlecode.com/svn/trunk@465 df820570-a93a-0410-bd06-b72b767a4274
2013-08-16 19:24:01 +00:00
\fB\-o\fR ahbe_conf (default="" which means disabled)
This option specifies the configuration file path which file is the additional HTTP header by file(object) extension.
The configuration file format is below:
-----------
line = [file suffix] HTTP-header [HTTP-values]
file suffix = file(object) suffix, if this field is empty, it means "*"(all object).
HTTP-header = additional HTTP header name
HTTP-values = additional HTTP header value
-----------
Sample:
-----------
.gz Content-Encoding gzip
.Z Content-Encoding compress
X-S3FS-MYHTTPHEAD myvalue
-----------
A sample configuration file is uploaded in "test" directory.
If you specify this option for set "Content-Encoding" HTTP header, please take care for RFC 2616.
.TP
\fB\-o\fR public_bucket (default="" which means disabled)
anonymously mount a public bucket when set to 1, ignores the $HOME/.passwd-s3fs and /etc/passwd-s3fs files.
.TP
\fB\-o\fR connect_timeout (default="300" seconds)
time to wait for connection before giving up.
.TP
\fB\-o\fR readwrite_timeout (default="60" seconds)
time to wait between read/write activity before giving up.
.TP
\fB\-o\fR max_stat_cache_size (default="1000" entries (about 4MB))
maximum number of entries in the stat cache
.TP
Summary of Changes(1.62 -> 1.63) 1) Lifetime for the stats cache Added the new option "stat_cache_expire". This option which is specified by seconds means the lifetime for each stats cache entry. If this option is not specified, the stats cache is kept in s3fs process until the stats cache grown to maximum size. (default) If this option is specified, the stats cache entry is out from the memory when the entry expires time. 2) Enable file permission s3fs before 1.62 did not consider the file access permission. s3fs after this version can consider it. For access permission, the s3fs_getattr() function was divided into sub function which can check the file access permission. It is like access() function. And the function calling the s3fs_getattr() calls this new sub function instead of s3fs_getattr(). Last the s3fs_opendir() function which is called by FUSE was added for checking directory access permission when listing the files in directory. 3) UID/GUID When a file or a directory was created, the s3fs could not set the UID/GID as the user who executed a command. (Almost the UID/GID are root, because the s3fs run by root.) After this version, the s3fs set correct UID/GID as the user who executes the commond. 4) About the mtime If the object does not have "x-amz-meta-mtime" meta, the s3fs uses the "Last-Modified" header instead of it. But the s3fs had a bug in this code, and this version fixed this bug. When user modified the file, the s3fs did not update the mtime of the file. This version fixed this bug. In the get_local_fd() function, the local file's mtime was changed only when s3fs run with "use_cache" option. This version always updates the mtime whether the local cache file is used or not. And s3fs_flush ( ) function set the mtime of local cache file from S3 object mtime, but it was wrong . This version is that the s3fs_flush ( ) changes the mtime of S3 object from the local cache file or the tmpfile . The s3fs cuts some requests, because the s3fs can always check mtime whether the s3fs uses or does not use the local cache file. 5) A case of no "x-amz-meta-mode" If the object did not have "x-amz-meta-mtime" mete, the s3fs recognized the file as not regular file. After this version, the s3fs recognizes the file as regular file. 6) "." and ".." directory The s3fs_readdir() did not return "X" and "XX" directory name. After this version, the s3fs is changed that it returns "X" and "XX". Example, the result of "ls" lists "X" and "XX" directory. 7) Fixed a bug The insert_object() had a bug, and it is fixed. git-svn-id: http://s3fs.googlecode.com/svn/trunk@390 df820570-a93a-0410-bd06-b72b767a4274
2013-02-24 08:58:54 +00:00
\fB\-o\fR stat_cache_expire (default is no expire)
specify expire time(seconds) for entries in the stat cache
.TP
\fB\-o\fR enable_noobj_cache (default is disable)
enable cache entries for the object which does not exist.
s3fs always has to check whether file(or sub directory) exists under object(path) when s3fs does some command, since s3fs has recognized a directory which does not exist and has files or sub directories under itself.
It increases ListBucket request and makes performance bad.
You can specify this option for performance, s3fs memorizes in stat cache that the object(file or directory) does not exist.
.TP
\fB\-o\fR no_check_certificate (by default this option is disabled)
do not check ssl certificate.
server certificate won't be checked against the available certificate authorities.
.TP
\fB\-o\fR nodnscache - disable dns cache.
s3fs is always using dns cache, this option make dns cache disable.
.TP
\fB\-o\fR nosscache - disable ssl session cache.
s3fs is always using ssl session cache, this option make ssl session cache disable.
.TP
\fB\-o\fR multireq_max (default="20")
maximum number of parallel request for listing objects.
.TP
Changes codes for performance(part 3) * Summay This revision includes big change about temporary file and local cache file. By this big change, s3fs works with good performance when s3fs opens/ closes/syncs/reads object. I made a big change about the handling about temporary file and local cache file to do this implementation. * Detail 1) About temporary file(local file) s3fs uses a temporary file on local file system when s3fs does download/ upload/open/seek object on S3. After this revision, s3fs calls ftruncate() function when s3fs makes the temporary file. In this way s3fs can set a file size of precisely length without downloading. (Notice - ftruncate function is for XSI-compliant systems, so that possibly you have a problem on non-XSI-compliant systems.) By this change, s3fs can download a part of a object by requesting with "Range" http header. It seems like downloading by each block unit. The default block(part) size is 50MB, it is caused the result which is default parallel requests count(5) by default multipart upload size(10MB). If you need to change this block size, you can change by new option "fd_page_size". This option can take from 1MB(1024 * 1024) to any bytes. So that, you have to take care about that fdcache.cpp(and fdcache.h) were changed a lot. 2) About local cache Local cache files which are in directory specified by "use_cache" option do not have always all of object data. This cause is that s3fs uses ftruncate function and reads(writes) each block unit of a temporary file. s3fs manages each block unit's status which are "downloaded area" or "not". For this status, s3fs makes new temporary file in cache directory which is specified by "use_cache" option. This status files is in a directory which is named "<use_cache sirectory>/.<bucket_name>/". When s3fs opens this status file, s3fs locks this file for exclusive control by calling flock function. You need to take care about this, the status files can not be laid on network drive(like NFS). This revision changes about file open mode, s3fs always opens a local cache file and each status file with writable mode. Last, this revision adds new option "del_cache", this option means that s3fs deletes all local cache file when s3fs starts and exits. 3) Uploading When s3fs writes data to file descriptor through FUSE request, old s3fs revision downloads all of the object. But new revision does not download all, it downloads only small percial area(some block units) including writing data area. And when s3fs closes or flushes the file descriptor, s3fs downloads other area which is not downloaded from server. After that, s3fs uploads all of data. Already r456 revision has parallel upload function, then this revision with r456 and r457 are very big change for performance. 4) Downloading By changing a temporary file and a local cache file, when s3fs downloads a object, it downloads only the required range(some block units). And s3fs downloads units by parallel GET request, it is same as a case of uploading. (Maximum parallel request count and each download size are specified same parameters for uploading.) In the new revision, when s3fs opens file, s3fs returns file descriptor soon. Because s3fs only opens(makes) the file descriptor with no downloading data. And when s3fs reads a data, s3fs downloads only some block unit including specified area. This result is good for performance. 5) Changes option name The option "parallel_upload" which added at r456 is changed to new option name as "parallel_count". This reason is this option value is not only used by uploading object, but a uploading object also uses this option. (For a while, you can use old option name "parallel_upload" for compatibility.) git-svn-id: http://s3fs.googlecode.com/svn/trunk@458 df820570-a93a-0410-bd06-b72b767a4274
2013-07-23 16:01:48 +00:00
\fB\-o\fR parallel_count (default="5")
number of parallel request for uploading big objects.
2014-03-30 07:53:41 +00:00
s3fs uploads large object(default:over 20MB) by multipart post request, and sends parallel requests.
This option limits parallel request count which s3fs requests at once.
It is necessary to set this value depending on a CPU and a network band.
2014-03-30 07:53:41 +00:00
This option is lated to fd_page_size option and affects it.
.TP
Changes codes for performance(part 3) * Summay This revision includes big change about temporary file and local cache file. By this big change, s3fs works with good performance when s3fs opens/ closes/syncs/reads object. I made a big change about the handling about temporary file and local cache file to do this implementation. * Detail 1) About temporary file(local file) s3fs uses a temporary file on local file system when s3fs does download/ upload/open/seek object on S3. After this revision, s3fs calls ftruncate() function when s3fs makes the temporary file. In this way s3fs can set a file size of precisely length without downloading. (Notice - ftruncate function is for XSI-compliant systems, so that possibly you have a problem on non-XSI-compliant systems.) By this change, s3fs can download a part of a object by requesting with "Range" http header. It seems like downloading by each block unit. The default block(part) size is 50MB, it is caused the result which is default parallel requests count(5) by default multipart upload size(10MB). If you need to change this block size, you can change by new option "fd_page_size". This option can take from 1MB(1024 * 1024) to any bytes. So that, you have to take care about that fdcache.cpp(and fdcache.h) were changed a lot. 2) About local cache Local cache files which are in directory specified by "use_cache" option do not have always all of object data. This cause is that s3fs uses ftruncate function and reads(writes) each block unit of a temporary file. s3fs manages each block unit's status which are "downloaded area" or "not". For this status, s3fs makes new temporary file in cache directory which is specified by "use_cache" option. This status files is in a directory which is named "<use_cache sirectory>/.<bucket_name>/". When s3fs opens this status file, s3fs locks this file for exclusive control by calling flock function. You need to take care about this, the status files can not be laid on network drive(like NFS). This revision changes about file open mode, s3fs always opens a local cache file and each status file with writable mode. Last, this revision adds new option "del_cache", this option means that s3fs deletes all local cache file when s3fs starts and exits. 3) Uploading When s3fs writes data to file descriptor through FUSE request, old s3fs revision downloads all of the object. But new revision does not download all, it downloads only small percial area(some block units) including writing data area. And when s3fs closes or flushes the file descriptor, s3fs downloads other area which is not downloaded from server. After that, s3fs uploads all of data. Already r456 revision has parallel upload function, then this revision with r456 and r457 are very big change for performance. 4) Downloading By changing a temporary file and a local cache file, when s3fs downloads a object, it downloads only the required range(some block units). And s3fs downloads units by parallel GET request, it is same as a case of uploading. (Maximum parallel request count and each download size are specified same parameters for uploading.) In the new revision, when s3fs opens file, s3fs returns file descriptor soon. Because s3fs only opens(makes) the file descriptor with no downloading data. And when s3fs reads a data, s3fs downloads only some block unit including specified area. This result is good for performance. 5) Changes option name The option "parallel_upload" which added at r456 is changed to new option name as "parallel_count". This reason is this option value is not only used by uploading object, but a uploading object also uses this option. (For a while, you can use old option name "parallel_upload" for compatibility.) git-svn-id: http://s3fs.googlecode.com/svn/trunk@458 df820570-a93a-0410-bd06-b72b767a4274
2013-07-23 16:01:48 +00:00
\fB\-o\fR fd_page_size(default="52428800"(50MB))
2015-07-10 18:50:40 +00:00
number of internal management page size for each file descriptor.
Changes codes for performance(part 3) * Summay This revision includes big change about temporary file and local cache file. By this big change, s3fs works with good performance when s3fs opens/ closes/syncs/reads object. I made a big change about the handling about temporary file and local cache file to do this implementation. * Detail 1) About temporary file(local file) s3fs uses a temporary file on local file system when s3fs does download/ upload/open/seek object on S3. After this revision, s3fs calls ftruncate() function when s3fs makes the temporary file. In this way s3fs can set a file size of precisely length without downloading. (Notice - ftruncate function is for XSI-compliant systems, so that possibly you have a problem on non-XSI-compliant systems.) By this change, s3fs can download a part of a object by requesting with "Range" http header. It seems like downloading by each block unit. The default block(part) size is 50MB, it is caused the result which is default parallel requests count(5) by default multipart upload size(10MB). If you need to change this block size, you can change by new option "fd_page_size". This option can take from 1MB(1024 * 1024) to any bytes. So that, you have to take care about that fdcache.cpp(and fdcache.h) were changed a lot. 2) About local cache Local cache files which are in directory specified by "use_cache" option do not have always all of object data. This cause is that s3fs uses ftruncate function and reads(writes) each block unit of a temporary file. s3fs manages each block unit's status which are "downloaded area" or "not". For this status, s3fs makes new temporary file in cache directory which is specified by "use_cache" option. This status files is in a directory which is named "<use_cache sirectory>/.<bucket_name>/". When s3fs opens this status file, s3fs locks this file for exclusive control by calling flock function. You need to take care about this, the status files can not be laid on network drive(like NFS). This revision changes about file open mode, s3fs always opens a local cache file and each status file with writable mode. Last, this revision adds new option "del_cache", this option means that s3fs deletes all local cache file when s3fs starts and exits. 3) Uploading When s3fs writes data to file descriptor through FUSE request, old s3fs revision downloads all of the object. But new revision does not download all, it downloads only small percial area(some block units) including writing data area. And when s3fs closes or flushes the file descriptor, s3fs downloads other area which is not downloaded from server. After that, s3fs uploads all of data. Already r456 revision has parallel upload function, then this revision with r456 and r457 are very big change for performance. 4) Downloading By changing a temporary file and a local cache file, when s3fs downloads a object, it downloads only the required range(some block units). And s3fs downloads units by parallel GET request, it is same as a case of uploading. (Maximum parallel request count and each download size are specified same parameters for uploading.) In the new revision, when s3fs opens file, s3fs returns file descriptor soon. Because s3fs only opens(makes) the file descriptor with no downloading data. And when s3fs reads a data, s3fs downloads only some block unit including specified area. This result is good for performance. 5) Changes option name The option "parallel_upload" which added at r456 is changed to new option name as "parallel_count". This reason is this option value is not only used by uploading object, but a uploading object also uses this option. (For a while, you can use old option name "parallel_upload" for compatibility.) git-svn-id: http://s3fs.googlecode.com/svn/trunk@458 df820570-a93a-0410-bd06-b72b767a4274
2013-07-23 16:01:48 +00:00
For delayed reading and writing by s3fs, s3fs manages pages which is separated from object. Each pages has a status that data is already loaded(or not loaded yet).
This option should not be changed when you don't have a trouble with performance.
2014-03-30 07:53:41 +00:00
This value is changed automatically by parallel_count and multipart_size values(fd_page_size value = parallel_count * multipart_size).
.TP
\fB\-o\fR multipart_size(default="10"(10MB))
number of one part size in multipart uploading request.
The default size is 10MB(10485760byte), this value is minimum size.
Specify number of MB and over 10(MB).
This option is lated to fd_page_size option and affects it.
Changes codes for performance(part 3) * Summay This revision includes big change about temporary file and local cache file. By this big change, s3fs works with good performance when s3fs opens/ closes/syncs/reads object. I made a big change about the handling about temporary file and local cache file to do this implementation. * Detail 1) About temporary file(local file) s3fs uses a temporary file on local file system when s3fs does download/ upload/open/seek object on S3. After this revision, s3fs calls ftruncate() function when s3fs makes the temporary file. In this way s3fs can set a file size of precisely length without downloading. (Notice - ftruncate function is for XSI-compliant systems, so that possibly you have a problem on non-XSI-compliant systems.) By this change, s3fs can download a part of a object by requesting with "Range" http header. It seems like downloading by each block unit. The default block(part) size is 50MB, it is caused the result which is default parallel requests count(5) by default multipart upload size(10MB). If you need to change this block size, you can change by new option "fd_page_size". This option can take from 1MB(1024 * 1024) to any bytes. So that, you have to take care about that fdcache.cpp(and fdcache.h) were changed a lot. 2) About local cache Local cache files which are in directory specified by "use_cache" option do not have always all of object data. This cause is that s3fs uses ftruncate function and reads(writes) each block unit of a temporary file. s3fs manages each block unit's status which are "downloaded area" or "not". For this status, s3fs makes new temporary file in cache directory which is specified by "use_cache" option. This status files is in a directory which is named "<use_cache sirectory>/.<bucket_name>/". When s3fs opens this status file, s3fs locks this file for exclusive control by calling flock function. You need to take care about this, the status files can not be laid on network drive(like NFS). This revision changes about file open mode, s3fs always opens a local cache file and each status file with writable mode. Last, this revision adds new option "del_cache", this option means that s3fs deletes all local cache file when s3fs starts and exits. 3) Uploading When s3fs writes data to file descriptor through FUSE request, old s3fs revision downloads all of the object. But new revision does not download all, it downloads only small percial area(some block units) including writing data area. And when s3fs closes or flushes the file descriptor, s3fs downloads other area which is not downloaded from server. After that, s3fs uploads all of data. Already r456 revision has parallel upload function, then this revision with r456 and r457 are very big change for performance. 4) Downloading By changing a temporary file and a local cache file, when s3fs downloads a object, it downloads only the required range(some block units). And s3fs downloads units by parallel GET request, it is same as a case of uploading. (Maximum parallel request count and each download size are specified same parameters for uploading.) In the new revision, when s3fs opens file, s3fs returns file descriptor soon. Because s3fs only opens(makes) the file descriptor with no downloading data. And when s3fs reads a data, s3fs downloads only some block unit including specified area. This result is good for performance. 5) Changes option name The option "parallel_upload" which added at r456 is changed to new option name as "parallel_count". This reason is this option value is not only used by uploading object, but a uploading object also uses this option. (For a while, you can use old option name "parallel_upload" for compatibility.) git-svn-id: http://s3fs.googlecode.com/svn/trunk@458 df820570-a93a-0410-bd06-b72b767a4274
2013-07-23 16:01:48 +00:00
.TP
\fB\-o\fR url (default="http://s3.amazonaws.com")
sets the url to use to access Amazon S3. If you want to use HTTPS, then you can set url=https://s3.amazonaws.com
.TP
\fB\-o\fR endpoint (default="us-east-1")
sets the endpoint to use.
If this option is not specified, s3fs uses \"us-east-1\" region as the default.
If the s3fs could not connect to the region specified by this option, s3fs could not run.
But if you do not specify this option, and if you can not connect with the default region, s3fs will retry to automatically connect to the other region.
So s3fs can know the correct region name, because s3fs can find it in an error from the S3 server.
.TP
\fB\-o\fR sigv2 (default is signature version 4)
sets signing AWS requests by sing Signature Version 2.
.TP
\fB\-o\fR mp_umask (default is "0000")
sets umask for the mount point directory.
If allow_other option is not set, s3fs allows access to the mount point only to the owner.
In the opposite case s3fs allows access to all users as the default.
But if you set the allow_other with this option, you can controll the permission permissions of the mount point by this option like umask.
.TP
\fB\-o\fR nomultipart - disable multipart uploads
.TP
\fB\-o\fR enable_content_md5 ( default is disable )
verifying uploaded data without multipart by content-md5 header.
Enable to send "Content-MD5" header when uploading a object without multipart posting.
If this option is enabled, it has some influences on a performance of s3fs when uploading small object.
Because s3fs always checks MD5 when uploading large object, this option does not affect on large object.
.TP
\fB\-o\fR iam_role ( default is no role )
set the IAM Role that will supply the credentials from the instance meta-data.
.TP
\fB\-o\fR noxmlns - disable registing xml name space.
disable registing xml name space for response of ListBucketResult and ListVersionsResult etc. Default name space is looked up from "http://s3.amazonaws.com/doc/2006-03-01".
This option should not be specified now, because s3fs looks up xmlns automatically after v1.66.
.TP
\fB\-o\fR nocopyapi - for other incomplete compatibility object storage.
For a distributed object storage which is compatibility S3 API without PUT(copy api).
If you set this option, s3fs do not use PUT with "x-amz-copy-source"(copy api). Because traffic is increased 2-3 times by this option, we do not recommend this.
Summary of Changes(1.63 -> 1.64) * This new version was made for fixing big issue about directory object. Please be careful and review new s3fs. ========================== List of Changes ========================== 1) Fixed bugs Fixed some memory leak and un-freed curl handle. Fixed codes with a bug which is not found yet. Fixed a bug that the s3fs could not update object's mtime when the s3fs had a opened file descriptor. Please let us know a bug, when you find new bug of a memory leak. 2) Changed codes Changed codes of s3fs_readdir() and list_bucket() etc. Changed codes so that the get_realpath() function returned std::string. Changed codes about exit() function. Because the exit() function is called from many fuse callback function directly, these function called fuse_exit() function and retuned with error. Changed codes so that the case of the characters for the "x-amz-meta" response header is ignored. 3) Added a option Added the norenameapi option for the storage compatible with S3 without copy API. This option is subset of nocopyapi option. Please read man page or call with --help option. 4) Object for directory This is very big and important change. The object of directory is changed "dir/" instead of "dir" for being compatible with other S3 client applications. And this version understands the object of directory which is made by old version. If the new s3fs changes the attributes or owner/group or mtime of the directory object, the s3fs automatically changes the object from old object name("dir") to new("dir/"). If you need to change old object name("dir") to new("dir/") manually, you can use shell script(mergedir.sh) in test directory. * About the directory object name AWS S3 allows the object name as both "dir" and "dir/". The s3fs before this version understood only "dir" as directory object name, but old version did not understand the "dir/" object name. The new version understands both of "dir" and "dir/" object name. The s3fs user needs to be care for the special situation that I mentioned later. The new version deletes old "dir" object and makes new "dir/" object, when the user operates the directory object for changing the permission or owner/group or mtime. This operation does on background and automatically. If you need to merge manually, you can use shell script which is mergedir.sh in test directory. This script runs chmod/chown/touch commands after finding a directory. Other S3 client application makes a directory object("dir/") without meta information which is needed to understand by the s3fs, this script can add meta information for a directory object. If this script function is insufficient for you, you can read and modify the codes by yourself. Please use the shell script carefully because of changing the object. If you find a bug in this script, please let me know. * Details ** The directory object made by old version The directory object made by old version is not understood by other S3 client application. New s3fs version was updated for keeping compatibility with other clients. You can use the mergedir.sh in test directory for merging from old directory object("dir") to new("dir/"). The directory object name is changed from "dir" to "dir/" after the mergedir.sh is run, this changed "dir/" object is understood by other S3 clients. This script runs chmod/chown/chgrp/touch/etc commands against the old directory object("dir"), then new s3fs merges that directory automatically. If you need to change directory object from old to new manually, you can do it by running these commands which change the directory attributes(mode/owner/group/mtime). ** The directory object made by new version The directory object name made by new version is "dir/". Because the name includes "/", other S3 client applications understand it as the directory. I tested new directory by s3cmd/tntDrive/DragonDisk/Gladinet as other S3 clients, the result was good compatibility. You need to know that the compatibility has small problem by the difference in specifications between clients. And you need to be careful about that the old s3fs can not understand the directory object which made by new s3fs. You should change all s3fs which accesses same bucket. ** The directory object made by other S3 client application Because the object is determined as a directory by the s3fs, the s3fs makes and uses special meta information which is "x-amz-meta-***" and "Content-Type" as HTTP header. The s3fs sets and uses HTTP headers for the directory object, those headers are listed below. Content-Type: application/x-directory x-amz-meta-mode: <mode> x-amz-meta-uid: <UID> x-amz-meta-gid <GID> x-amz-meta-mtime: <unix time of modified file> Other S3 client application builds the directory object without attributes which is needed by the s3fs. When the "ls" command is run on the s3fs-fuse file system which has directories/files made by other S3 clients, this result is shown below. d--------- 1 root root 0 Feb 27 11:21 dir ---------- 1 root root 1024 Mar 14 02:15 file Because the objects don't have meta information("x-amz-meta-mode"), it means mode=0000. In this case, the directory object is shown only "d", because the s3fs determines the object as a directory when the object is the name with "/" or has "Content-type: application/x-directory" header. (The s3fs sets "Content-Type: application/x-directory" to the directory object, but other S3 clients set "binary/octet-stream".) In that result, nobody without root is allowed to operate the object. The owner and group are "root"(UID=0) because the object doesn't have "x-amz-meta-uid/gid". If the object doesn't have "x-amz-meta-mtime", the s3fs uses "Last-Modified" HTTP header. Therefore the object's mtime is "Last-Modified" value.(This logic is same as old version) It has been already explained, if you need to change the object attributes, you can do it by manually operation or mergedir.sh. * Example of the compatibility with s3cmd etc ** Case A) Only "dir/file" object One of case, there is only "dir/file" object without "dir/" object, that object is made by s3cmd or etc. In this case, the response of REST API(list bucket) with "delimiter=/" parameter has "CommonPrefixes", and the "dir/" is listed in "CommonPrefixes/Prefix", but the "dir/" object is not real object. The s3fs needs to determine this object as directory, however there is no real directory object("dir" or "dir/"). But both new s3fs and old one does NOT understand this "dir/" in "CommonPrefixes", because the s3fs fails to get meta information from "dir" or "dir/". On this case, the result of "ls" command is shown below. ??????????? ? ? ? ? ? dir This "dir" is not operated by anyone and any process, because the s3fs does not understand this object permission. And "dir/file" object can not be shown and operated too. Some other S3 clients(tntDrive/Gladinet/etc) can not understand this object as same as the s3fs. If you need to operate "dir/file" object, you need to make the "dir/" object as a directory. To make the "dir/" directory object, you need to do below. Because there is already the "dir" object which is not real object, you can not make "dir/" directory. (s3cmd does not make "dir/" object because the object name has "/".). You should make another name directory(ex: "dir2/"), and move the "dir/file" objects to in new directory. Last, you can rename the directory name from "dir2/" to "dir/". ** Case B) Both "dir" and "dir/file" object This case is that there are "dir" and "dir/file" objects which were made by s3cmd/etc. s3cmd and s3fs understand the "dir" object as normal(file) object because this object does not have meta information and a name with "/". But the result of REST API(list bucket) has "dir/" name in "CommonPrefixes/Prefix". The s3fs checks "dir/" and "dir" as a directory, but the "dir" object is not directory object. (Because the new s3fs need to compatible old version, the s3fs checks a directory object in order of "dir/", "dir") In this case, the result of "ls" command is shown below. ---------- 1 root root 0 Feb 27 02:48 dir As a result, the "dir/file" can not be shown and operated because the "dir" object is a file. If you determine the "dir" as a directory, you need to add mete information to the "dir" object by s3cmd. ** Case C) Both "dir" and "dir/" object Last case is that there are "dir" and "dir/" objects which were made by other S3 clients. (example: At first you upload a object "dir/" as a directory by new 3sfs, and you upload a object "dir" by s3cmd.) New s3fs determines "dir/" as a directory, because the s3fs searches in oder of "dir/", "dir". As a result, the "dir" object can not be shown and operated. ** Compatibility between S3 clients Both new and old s3fs do not understand both "dir" and "dir/" at the same time, tntDrive and Galdinet are same as the s3fs. If there are "dir/" and "dir" objects, the s3fs gives priority to "dir/". But s3cmd and DragonDisk understand both objects. git-svn-id: http://s3fs.googlecode.com/svn/trunk@392 df820570-a93a-0410-bd06-b72b767a4274
2013-03-23 14:04:07 +00:00
.TP
\fB\-o\fR norenameapi - for other incomplete compatibility object storage.
For a distributed object storage which is compatibility S3 API without PUT(copy api).
This option is a subset of nocopyapi option. The nocopyapi option does not use copy-api for all command(ex. chmod, chown, touch, mv, etc), but this option does not use copy-api for only rename command(ex. mv).
If this option is specified with nocopapi, the s3fs ignores it.
.TP
\fB\-o\fR use_path_request_style (use legacy API calling style)
Enble compatibility with S3-like APIs which do not support the virtual-host request style, by using the older path request style.
.SH FUSE/MOUNT OPTIONS
.TP
2014-10-01 10:20:29 +00:00
Most of the generic mount options described in 'man mount' are supported (ro, rw, suid, nosuid, dev, nodev, exec, noexec, atime, noatime, sync async, dirsync). Filesystems are mounted with '\-onodev,nosuid' by default, which can only be overridden by a privileged user.
.TP
There are many FUSE specific mount options that can be specified. e.g. allow_other. See the FUSE README for the full set.
.SH NOTES
.TP
Maximum file size=64GB (limited by s3fs, not Amazon).
.TP
If enabled via the "use_cache" option, s3fs automatically maintains a local cache of files in the folder specified by use_cache. Whenever s3fs needs to read or write a file on S3, it first downloads the entire file locally to the folder specified by use_cache and operates on it. When fuse_release() is called, s3fs will re-upload the file to S3 if it has been changed. s3fs uses md5 checksums to minimize downloads from S3.
.TP
The folder specified by use_cache is just a local cache. It can be deleted at any time. s3fs rebuilds it on demand.
.TP
Local file caching works by calculating and comparing md5 checksums (ETag HTTP header).
.TP
s3fs leverages /etc/mime.types to "guess" the "correct" content-type based on file name extension. This means that you can copy a website to S3 and serve it up directly from S3 with correct content-types!
.SH BUGS
Due to S3's "eventual consistency" limitations, file creation can and will occasionally fail. Even after a successful create, subsequent reads can fail for an indeterminate time, even after one or more successful reads. Create and read enough files and you will eventually encounter this failure. This is not a flaw in s3fs and it is not something a FUSE wrapper like s3fs can work around. The retries option does not address this issue. Your application must either tolerate or compensate for these failures, for example by retrying creates or reads.
.SH AUTHOR
s3fs has been written by Randy Rizun <rrizun@gmail.com>.