.TH S3FS "1" "@MAN_PAGE_DATE@" "S3FS" "User Commands"
.SH NAME
S3FS \- FUSE-based file system backed by Amazon S3
.SH SYNOPSIS
.SS mounting
.TP
\fB s3fs bucket[:/path] mountpoint \fP [options]
.TP
\fB s3fs mountpoint \fP [options (must specify bucket= option)]
.SS unmounting
.TP
\fB umount mountpoint
For root.
.TP
\fB fusermount -u mountpoint
For unprivileged user.
.SS utility mode (remove objects left by interrupted multipart uploads)
.TP
\fB s3fs --incomplete-mpu-list (-u) bucket
.TP
\fB s3fs --incomplete-mpu-abort[=all | =<expire date format>] bucket
.SH DESCRIPTION
s3fs is a FUSE filesystem that allows you to mount an Amazon S3 bucket as a local filesystem. It stores files natively and transparently in S3 (i.e., you can use other programs to access the same files).
.SH AUTHENTICATION
s3fs supports the standard AWS credentials file (https://docs.aws.amazon.com/cli/latest/userguide/cli-config-files.html) stored in `${HOME}/.aws/credentials`.
Alternatively, s3fs supports a custom passwd file. Only the AWS credentials file format can be used when an AWS session token is required.
The s3fs password file has this format (use this format if you have only one set of credentials):
.RS 4
\fB accessKeyId\fP :\fB secretAccessKey\fP
.RE
If you have more than one set of credentials, this syntax is also recognized:
.RS 4
\fB bucketName\fP :\fB accessKeyId\fP :\fB secretAccessKey\fP
.RE
.PP
Password files can be stored in two locations:
.RS 4
\fB /etc/passwd-s3fs\fP [0640]
\fB $HOME/.passwd-s3fs\fP [0600]
.RE
.PP
s3fs also recognizes the \fB AWS_ACCESS_KEY_ID\fP and \fB AWS_SECRET_ACCESS_KEY\fP environment variables.
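.PP
As a minimal sketch of the single-credential format above, the commands below create a per-user password file with the 0600 permission listed for it. The key values are hypothetical placeholders, and the commented mount line uses a hypothetical bucket name and mount point.
.nf
```shell
# Hypothetical placeholder credentials; replace with real values.
ACCESS_KEY_ID="AKIAEXAMPLEKEYID"
SECRET_ACCESS_KEY="exampleSecretAccessKey"

# Per-user password file location; set PASSWD_FILE beforehand to write elsewhere.
PASSWD_FILE="${PASSWD_FILE:-${HOME}/.passwd-s3fs}"

# Restrict permissions before writing, then store accessKeyId:secretAccessKey.
umask 077
echo "${ACCESS_KEY_ID}:${SECRET_ACCESS_KEY}" > "${PASSWD_FILE}"
chmod 600 "${PASSWD_FILE}"

# Example mount using this file (shown as a comment, not executed here):
#   s3fs mybucket /path/to/mountpoint -o passwd_file="${PASSWD_FILE}"
```
.fi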
.SH OPTIONS
.SS "general options"
.TP
\fB \- h\fR \fB \- \- help\fR
print help
.TP
\fB \ \fR \fB \- \- version\fR
print version
.TP
\fB \- f\fR
FUSE foreground option - do not run as daemon.
.TP
\fB \- s\fR
FUSE single-threaded option (disables multi-threaded operation)
.SS "mount options"
.TP
All s3fs options must be given in the form where "opt" is:
<option_name>=<option_value>
.TP
\fB \- o\fR bucket
if the bucket name (and path) is not specified on the command line, it must be specified with this option after the \- o option.
.TP
\fB \- o\fR default_acl (default="private")
the default canned acl to apply to all written s3 objects, e.g., "private", "public-read".
see https://docs.aws.amazon.com/AmazonS3/latest/dev/acl-overview.html#canned-acl for the full list of canned ACLs.
.TP
\fB \- o\fR retries (default="5")
number of times to retry a failed S3 transaction.
.TP
\fB \- o\fR tmpdir (default="/tmp")
local folder for temporary files.
.TP
\fB \- o\fR use_cache (default="" which means disabled)
local folder to use for local file cache.
.TP
\fB \- o\fR check_cache_dir_exist (default is disabled)
If use_cache is set, check if the cache directory exists.
If this option is not specified, the cache directory is created at runtime when it does not exist.
.TP
\fB \- o\fR del_cache - delete local file cache
delete local file cache when s3fs starts and exits.
.TP
\fB \- o\fR storage_class (default="standard")
store objects with the specified storage class.
Possible values: standard, standard_ia, onezone_ia, reduced_redundancy, intelligent_tiering, glacier, glacier_ir, and deep_archive.
.TP
\fB \- o\fR use_rrs (default is disabled)
use Amazon's Reduced Redundancy Storage.
this option can not be specified with use_sse.
(can specify use_rrs=1 for old version)
this option has been replaced by the new storage_class option.
.TP
\fB \- o\fR use_sse (default is disabled)
Specify one of three types of Amazon's Server-Side Encryption: SSE-S3, SSE-C or SSE-KMS. SSE-S3 uses Amazon S3-managed encryption keys, SSE-C uses customer-provided encryption keys, and SSE-KMS uses the master key which you manage in AWS KMS.
Specifying "use_sse" or "use_sse=1" enables the SSE-S3 type (use_sse=1 is the old-style parameter).
To use SSE-C, specify "use_sse=custom", "use_sse=custom:<custom key file path>" or "use_sse=<custom key file path>" (specifying only <custom key file path> is the old-style parameter).
You can use "c" as shorthand for "custom".
The custom key file must have 600 permissions. The file can contain several lines, and each line is one SSE-C key.
The first line in the file is used as the customer-provided encryption key for uploading, changing headers, etc.
Any keys after the first line are used for downloading objects that were encrypted with a key other than the first one.
You can therefore keep all SSE-C keys in the file as an SSE-C key history.
If you specify "custom" ("c") without a file path, you need to set the custom key with the load_sse_c option or the AWSSSECKEYS environment variable. (AWSSSECKEYS holds one or more SSE-C keys separated by ":".)
This option is used to decide the SSE type.
If you do not want to encrypt objects when uploading but need to decrypt encrypted objects when downloading, use the load_sse_c option instead of this option.
To use SSE-KMS, specify "use_sse=kmsid" or "use_sse=kmsid:<kms id>".
You can use "k" as shorthand for "kmsid".
If you want to specify the SSE-KMS type with your <kms id> in AWS KMS, set it after "kmsid:" (or "k:").
If you specify only "kmsid" ("k"), you need to set the AWSSSEKMSID environment variable to your <kms id>.
Be careful: you cannot use a KMS id that is not in the same EC2 region.
Additionally, if you specify SSE-KMS, your endpoints must use Secure Sockets Layer (SSL) or Transport Layer Security (TLS).
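.IP
The use_sse forms described above can be sketched as option strings. This is an illustration only: the key file path, KMS id, bucket name and mount point are hypothetical placeholders, and the mount commands are shown as comments rather than executed.
.nf
```shell
# Hypothetical placeholders for illustration.
SSE_C_KEY_FILE="/path/to/sse-c.keys"   # the key file must have 600 permissions
KMS_ID="example-kms-id"

SSE_S3_OPT="use_sse"                           # SSE-S3
SSE_C_OPT="use_sse=custom:${SSE_C_KEY_FILE}"   # SSE-C with a key file
SSE_KMS_OPT="use_sse=kmsid:${KMS_ID}"          # SSE-KMS with an explicit id

# SSE-KMS with the id taken from the environment instead of the option value.
export AWSSSEKMSID="${KMS_ID}"
SSE_KMS_ENV_OPT="use_sse=kmsid"

# Example mounts (shown as comments, not executed here):
#   s3fs mybucket /path/to/mountpoint -o "${SSE_S3_OPT}"
#   s3fs mybucket /path/to/mountpoint -o "${SSE_C_OPT}"
#   s3fs mybucket /path/to/mountpoint -o "${SSE_KMS_OPT}"
#   s3fs mybucket /path/to/mountpoint -o "${SSE_KMS_ENV_OPT}"
```
.fi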
.TP
\fB \- o\fR load_sse_c - specify SSE-C keys
Specify the path to the customer-provided encryption keys file used for decrypting objects when downloading.
To use a customer-provided encryption key when uploading, specify it with "use_sse=custom".
The file can contain many lines; each line is one custom key.
You can therefore keep all SSE-C keys in the file as an SSE-C key history.
The AWSSSECKEYS environment variable carries the same keys as this file.
.TP
\fB \- o\fR passwd_file (default="")
specify the path to the password file, which takes precedence over the password in $HOME/.passwd-s3fs and /etc/passwd-s3fs
.TP
\fB \- o\fR ahbe_conf (default="" which means disabled)
This option specifies the path to a configuration file that defines additional HTTP headers by file (object) extension.
The configuration file format is below:
-----------
line = [file suffix or regex] HTTP-header [HTTP-values]
file suffix = file (object) suffix. If this field is empty, it means "reg:(.*)" (= all objects).
regex = regular expression to match the file (object) path. This type starts with the "reg:" prefix.
HTTP-header = additional HTTP header name
HTTP-values = additional HTTP header value
-----------
Sample:
-----------
.gz Content-Encoding gzip
.Z Content-Encoding compress
reg:^/MYDIR/(.*)[.]t2$ Content-Encoding text2
-----------
A sample configuration file is provided in the "test" directory.
If you use this option to set the "Content-Encoding" HTTP header, please take care to follow RFC 2616.
.TP
\fB \- o\fR profile (default="default")
Choose a profile from ${HOME}/.aws/credentials to authenticate against S3.
Note that this format matches the AWS CLI format and differs from the s3fs passwd format.
.TP
\fB \- o\fR public_bucket (default="" which means disabled)
anonymously mount a public bucket when set to 1, ignores the $HOME/.passwd-s3fs and /etc/passwd-s3fs files.
S3 does not allow the copy object API for anonymous users, so s3fs sets the nocopyapi option automatically when public_bucket=1 is specified.
.TP
\fB \- o\fR connect_timeout (default="300" seconds)
time to wait for connection before giving up.
.TP
\fB \- o\fR readwrite_timeout (default="120" seconds)
time to wait between read/write activity before giving up.
.TP
\fB \- o\fR list_object_max_keys (default="1000")
specify the maximum number of keys returned by the S3 list object API. The default is 1000. You can set this value to 1000 or more.
.TP
\fB \- o\fR max_stat_cache_size (default="100,000" entries (about 40MB))
maximum number of entries in the stat cache and symbolic link cache.
.TP
\fB \- o\fR stat_cache_expire (default is 900)
specify the expire time (seconds) for entries in the stat cache and symbolic link cache. This expire time is measured from the time the entry was cached.
.TP
\fB \- o\fR stat_cache_interval_expire (default is 900)
specify the expire time (seconds) for entries in the stat cache and symbolic link cache. This expire time is measured from the last access time of each cache entry.
This option is exclusive with stat_cache_expire, and is left for compatibility with older versions.
.TP
\fB \- o\fR disable_noobj_cache (default is enable)
By default s3fs memorizes when an object does not exist up until the stat cache timeout.
This caching can cause staleness for applications.
If disabled, s3fs will not memorize non-existent objects, which may cause extra HeadObject requests and reduce performance.
.TP
\fB \- o\fR no_check_certificate (by default this option is disabled)
server certificate won't be checked against the available certificate authorities.
.TP
\fB \- o\fR ssl_verify_hostname (default="2")
When 0, do not verify the SSL certificate against the hostname.
.TP
\fB \- o\fR ssl_client_cert (default="")
Specify an SSL client certificate.
Specify this optional parameter in the following format:
"<SSL Cert>[:<Cert Type>[:<Private Key>[:<Key Type>
[:<Password>]]]]"
<SSL Cert>: Client certificate.
Specify the file path or NickName(for NSS, etc.).
<Cert Type>: Type of certificate, default is "PEM"(optional).
<Private Key>: Certificate's private key file(optional).
<Key Type>: Type of private key, default is "PEM"(optional).
<Password>: Passphrase of the private key(optional). It is also possible to omit this value and specify it using the environment variable "S3FS_SSL_PRIVKEY_PASSWORD".
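.IP
As a hedged sketch of the format above, the option value can be assembled from its colon-separated fields. All file paths, the passphrase, the bucket name and the mount point are hypothetical placeholders; the passphrase is supplied through the S3FS_SSL_PRIVKEY_PASSWORD environment variable instead of the last field.
.nf
```shell
# Hypothetical placeholder paths for illustration.
SSL_CERT="/path/to/client.crt"
CERT_TYPE="PEM"
PRIVATE_KEY="/path/to/client.key"
KEY_TYPE="PEM"

# Join the fields in the documented order, omitting the trailing <Password>.
SSL_CLIENT_CERT_OPT="ssl_client_cert=${SSL_CERT}:${CERT_TYPE}:${PRIVATE_KEY}:${KEY_TYPE}"

# Provide the private key passphrase through the environment instead.
export S3FS_SSL_PRIVKEY_PASSWORD="example-passphrase"

# Print the assembled option for review.
echo "${SSL_CLIENT_CERT_OPT}"

# Example mount (shown as a comment, not executed here):
#   s3fs mybucket /path/to/mountpoint -o "${SSL_CLIENT_CERT_OPT}"
```
.fi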
.TP
\fB \- o\fR nodnscache - disable DNS cache.
s3fs uses a DNS cache by default; this option disables it.
.TP
\fB \- o\fR nosscache - disable SSL session cache.
s3fs uses an SSL session cache by default; this option disables it.
.TP
\fB \- o\fR multireq_max (default="20")
maximum number of parallel requests for listing objects.
.TP
\fB \- o\fR parallel_count (default="5")
number of parallel requests for uploading large objects.
s3fs uploads large objects (over 25 MB by default) by multipart upload, sending the parts as parallel requests.
This option limits the number of requests that s3fs sends at once.
Set this value according to the available CPU and network bandwidth.
.TP
\fB \- o\fR multipart_size (default="10")
part size, in MB, for each multipart request.
The minimum value is 5 MB and the maximum value is 5 GB.
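.IP
As a rough sizing sketch, not s3fs output: S3 allows at most 10,000 parts per multipart upload, so the part size bounds the largest object that can be uploaded in parts. The arithmetic below assumes that s3fs uses a fixed part size equal to multipart_size and takes the default value of 10; the variable names are illustrative only.
.nf
```shell
multipart_size_mb=10   # value passed to -o multipart_size (default)
max_parts=10000        # S3 limit on the number of parts per multipart upload
max_object_mb=$(( multipart_size_mb * max_parts ))
echo "largest object with ${multipart_size_mb} MB parts: ${max_object_mb} MB"
```
.fi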
.TP
\fB \- o\fR multipart_copy_size (default="512")
part size, in MB, for each multipart copy request, used for
renames and mixupload.
The minimum value is 5 MB and the maximum value is 5 GB.
Must be at least 512 MB to copy the maximum 5 TB object size
but lower values may improve performance.
.TP
\fB \- o\fR max_dirty_data (default="5120")
Flush dirty data to S3 after a certain number of MB written.
The minimum value is 50 MB. A value of -1 disables this flush.
Cannot be used with nomixupload.
.TP
\fB \- o\fR bucket_size (default=maximum long unsigned integer value)
The size of the bucket with which the corresponding
elements of the statvfs structure will be filled. The option
argument is an integer optionally followed by a
multiplicative suffix (GB, GiB, TB, TiB, PB, PiB,
EB, EiB) (no spaces in between). If no suffix is supplied,
bytes are assumed; e.g., 20000000, 30GB, 45TiB. Note that
s3fs does not compute the actual volume size (too
expensive): by default it will assume the maximum possible
size; however, since this may confuse other software which
uses s3fs, the advertised bucket size can be set with this
option.
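.IP
To illustrate how such a value expands to bytes, here is a minimal shell sketch. The helper to_bytes is hypothetical and not part of s3fs; it assumes that GB and TB are decimal multiples (powers of 1000) and that GiB and TiB are binary multiples (powers of 1024), and it handles only those four suffixes.
.nf
```shell
# Hypothetical helper, not part of s3fs: expand a bucket_size-style value to bytes.
# Assumption: GB/TB are powers of 1000, GiB/TiB are powers of 1024.
to_bytes() {
  value=$1
  num=${value%%[A-Za-z]*}     # leading digits
  suffix=${value#"$num"}      # remaining suffix, possibly empty
  case $suffix in
    "")  mult=1 ;;
    GB)  mult=$(( 1000 * 1000 * 1000 )) ;;
    GiB) mult=$(( 1024 * 1024 * 1024 )) ;;
    TB)  mult=$(( 1000 * 1000 * 1000 * 1000 )) ;;
    TiB) mult=$(( 1024 * 1024 * 1024 * 1024 )) ;;
    *)   echo "unsupported suffix: $suffix" >&2; return 1 ;;
  esac
  echo $(( num * mult ))
}

to_bytes 20000000
to_bytes 30GB
to_bytes 45TiB
```
.fi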
.TP
\fB \- o\fR ensure_diskfree (default 0)
sets the amount of free disk space, in MB, to preserve. This value is the free-space threshold for the disk that s3fs uses for its cache files.
s3fs creates local files for downloading, uploading and caching.
If the free disk space falls below this value, s3fs uses as little disk space as possible, at the cost of performance.
.TP
\fB \- o\fR free_space_ratio (default="10")
sets the minimum free space ratio of the disk, as a percentage. The value of this option can be between 0 and 100. s3fs limits
the size of the cache according to this ratio so that the free space ratio of the disk stays above this value.
For example, when the disk size is 50GB, the default value
ensures that at least 50GB * 10% = 5GB of disk space remains free.
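.IP
The same arithmetic as a small shell sketch, using the 50GB disk from the example and the default ratio; the variable names are illustrative and this is not s3fs output.
.nf
```shell
disk_gb=50            # example disk size
free_space_ratio=10   # value passed to -o free_space_ratio (default)
reserved_gb=$(( disk_gb * free_space_ratio / 100 ))
echo "s3fs keeps at least ${reserved_gb}GB free"
```
.fi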
.TP
\fB \- o\fR multipart_threshold (default="25")
threshold, in MB, to use multipart upload instead of
single-part. Must be at least 5 MB.
.TP
\fB \- o\fR singlepart_copy_limit (default="512")
maximum size, in MB, of a single-part copy before trying
multipart copy.
.TP
\fB \- o\fR host (default="https://s3.amazonaws.com")
Set a non-Amazon host, e.g., https://example.com.
.TP
\fB \- o\fR servicepath (default="/")
Set a service path when the non-Amazon host requires a prefix.
.TP
\fB \- o\fR url (default="https://s3.amazonaws.com")
sets the URL to use to access Amazon S3. To use HTTP, set "url=http://s3.amazonaws.com".
If you do not use HTTPS, you must specify the URL with this option.
.TP
\fB \- o\fR endpoint (default="us-east-1")
sets the endpoint to use with signature version 4.
If this option is not specified, s3fs uses the "us-east-1" region as the default.
If s3fs cannot connect to the region specified by this option, s3fs cannot run.
However, if you do not specify this option and s3fs cannot connect to the default region, s3fs automatically retries the connection with another region.
s3fs can learn the correct region name because it is included in the error returned by the S3 server.
.TP
\fB \- o\fR sigv2 (default is signature version 4 falling back to version 2)
sign AWS requests using only signature version 2.
.TP
\fB \- o\fR sigv4 (default is signature version 4 falling back to version 2)
sign AWS requests using only signature version 4.
.TP
\fB \- o\fR mp_umask (default is "0000")
sets umask for the mount point directory.
If the allow_other option is not set, s3fs allows access to the mount point only to the owner.
If allow_other is set, s3fs allows access to all users by default.
If you set allow_other together with this option, you can control the permissions of the mount point with this option, like umask.
.TP
\fB \- o\fR umask (default is "0000")
sets umask for files under the mountpoint. This can allow
users other than the mounting user to read and write to files
that they did not create.
.TP
\fB \- o\fR nomultipart - disable multipart uploads
.TP
\fB \- o\fR streamupload (default is disable)
Enable stream upload.
If this option is enabled, a sequential upload will be performed in parallel with the write from the part that has been written during a multipart upload.
This is expected to give better performance than other upload functions.
Note that this option is still experimental and may change in the future.
.TP
\fB \- o\fR max_thread_count (default is "5")
Specifies the number of threads waiting for stream uploads.
Note that this option and stream upload are still experimental and subject to change in the future.
This option will be merged with "parallel_count" in the future.
.TP
\fB \- o\fR enable_content_md5 (default is disable)
Allow S3 server to check data integrity of uploads via the Content-MD5 header.
This can add CPU overhead to transfers.
.TP
\fB \- o\fR enable_unsigned_payload (default is disable)
Do not calculate Content-SHA256 for PutObject and UploadPart
payloads. This can reduce the CPU overhead of transfers.
.TP
\fB \- o\fR ecs (default is disable)
This option instructs s3fs to query the ECS container credential metadata address instead of the instance metadata address.
.TP
\fB \- o\fR iam_role (default is no IAM role)
This option takes the IAM role name or "auto". If you specify "auto", s3fs automatically uses the IAM role names that are set on the instance. Specifying this option without an argument is the same as specifying "auto".
.TP
\fB \- o\fR imdsv1only (default is to use IMDSv2 with fallback to v1)
AWS instance metadata service, used with IAM role authentication,
supports the use of an API token. If you're using an IAM role in an
environment that does not support IMDSv2, setting this flag will skip
retrieval and usage of the API token when retrieving IAM credentials.
.TP
\fB \- o\fR ibm_iam_auth (default is not using IBM IAM authentication)
This option instructs s3fs to use IBM IAM authentication. In this mode, the AWSAccessKey and AWSSecretKey will be used as IBM's Service-Instance-ID and APIKey, respectively.
.TP
\fB \- o\fR ibm_iam_endpoint (default is https://iam.cloud.ibm.com)
Sets the URL to use for IBM IAM authentication.
.TP
\fB \- o\fR credlib (default=\"\" which means disabled)
Specifies the shared library that handles the credentials containing the authentication token.
If this option is specified, the credential and token processing provided by the shared library will be performed instead of the built-in credential processing.
This option cannot be specified with the passwd_file, profile, use_session_token, ecs, ibm_iam_auth, ibm_iam_endpoint, imdsv1only and iam_role options.
.TP
\fB \- o\fR credlib_opts (default=\"\" which means disabled)
Specifies the options to pass when the shared library specified in credlib is loaded and then initialized.
For the string specified in this option, specify the string defined by the shared library.
.TP
\fB \- o\fR use_xattr (default is not handling the extended attribute)
Enable handling of extended attributes (xattrs).
If you set this option, you can use extended attributes.
For example, encfs and ecryptfs require extended attribute support.
Notice: if s3fs handles extended attributes, the copy command with preserve=mode does not work on s3fs.
.TP
\fB \- o\fR noxmlns - disable registering xml name space.
disable registering the XML namespace for responses such as ListBucketResult and ListVersionsResult. The default namespace is looked up from "http://s3.amazonaws.com/doc/2006-03-01".
This option should no longer be specified, because s3fs looks up xmlns automatically after v1.66.
.TP
\fB \- o\fR nomixupload - disable copy in multipart uploads.
Disable the use of PUT (copy api) when multipart uploading large objects.
By default, when doing a multipart upload, ranges of unchanged data use PUT (copy api) whenever possible.
When nocopyapi or norenameapi is specified, PUT (copy api) is not used even if this option is not specified.
.TP
\fB \- o\fR nocopyapi - for other incomplete compatibility object storage.
For distributed object storage that is compatible with the S3 API but lacks PUT (copy api).
If you set this option, s3fs does not use PUT with "x-amz-copy-source" (copy api). Because this option increases traffic 2-3 times, we do not recommend it.
.TP
\fB \- o\fR norenameapi - for other incomplete compatibility object storage.
For distributed object storage that is compatible with the S3 API but lacks PUT (copy api).
This option is a subset of the nocopyapi option. The nocopyapi option avoids the copy api for all commands (e.g. chmod, chown, touch, mv), whereas this option avoids the copy api only for the rename command (e.g. mv).
If this option is specified together with nocopyapi, s3fs ignores it.
.TP
\fB \- o\fR use_path_request_style (use legacy API calling style)
Enable compatibility with S3-like APIs which do not support the virtual-host request style, by using the older path request style.
.TP
\fB \- o\fR listobjectsv2 (use ListObjectsV2)
Issue ListObjectsV2 instead of ListObjects, useful on object
stores without ListObjects support.
.TP
\fB \- o\fR noua (suppress User-Agent header)
By default, s3fs sends a User-Agent header in the "s3fs/<version> (commit hash <hash>; <using ssl library name>)" format.
If this option is specified, s3fs does not send the User-Agent header.
.TP
\fB \- o\fR cipher_suites
Customize the list of TLS cipher suites. Expects a colon-separated list of cipher suite names.
A list of available cipher suites, depending on your TLS engine, can be found in the curl library documentation:
https://curl.haxx.se/docs/ssl-ciphers.html
.TP
\fB \- o\fR instance_name
The instance name of the current s3fs mountpoint.
This name will be added to logging messages and user agent headers sent by s3fs.
.TP
\fB \- o\fR complement_stat (complement lack of file/directory mode)
s3fs complements missing file/directory mode information if a file or directory object does not have the x-amz-meta-mode header.
By default, s3fs does not complement stat information for an object, so such an object cannot be listed or modified.
.TP
\fB \- o\fR compat_dir (enable support of alternative directory names)
.RS
s3fs supports two different naming schemas "dir/" and "dir" to map directory names to S3 objects and vice versa by default. As a third variant, directories can be determined indirectly if there is a file object with a path (e.g. "/dir/file") but without the parent directory. This option enables a fourth variant, "dir_$folder$", created by older applications.
.TP
S3fs uses only the first schema "dir/" to create S3 objects for directories.
.TP
The support for these different naming schemas causes an increased communication effort.
.TP
If you do not have access permissions to the bucket and specify a directory path created by a client other than s3fs for the mount point, s3fs cannot start because it cannot find the mount point directory. Specifying this option avoids this error.
.RE
.TP
\fB \- o\fR use_wtf8 - support arbitrary file system encoding.
S3 requires all object names to be valid UTF-8. But some
clients, notably Windows NFS clients, use their own encoding.
This option re-encodes invalid UTF-8 object names into valid
UTF-8 by mapping offending codes into a 'private' codepage of the
Unicode set.
Useful on clients not using UTF-8 as their file system encoding.
.TP
\fB \- o\fR use_session_token - indicate that session token should be provided.
If credentials are provided by environment variables, this switch
forces a check for the presence of the AWS_SESSION_TOKEN variable.
If the variable is not set, an error is returned.
.TP
\fB \- o\fR requester_pays (default is disable)
This option instructs s3fs to enable requests involving Requester Pays buckets (It includes the 'x-amz-request-payer=requester' entry in the request header).
.TP
\fB \- o\fR mime (default is "/etc/mime.types")
Specify the path of the mime.types file.
If this option is not specified, the existence of "/etc/mime.types" is checked, and that file is loaded as mime information.
If this file does not exist on macOS, then "/etc/apache2/mime.types" is checked as well.
.TP
\fB \- o\fR proxy (default="")
This option specifies a proxy to the S3 server.
Specify the proxy in the '[<scheme>://]hostname(fqdn)[:<port>]' format.
The '<scheme>://' part can be omitted, and 'http://' is used when omitted.
The ':<port>' part can also be omitted. If omitted, port 443 is used for the HTTPS scheme, and port 1080 is used otherwise.
This option is the same as the curl command's '--proxy(-x)' option and libcurl's 'CURLOPT_PROXY' flag.
This option is equivalent to and takes precedence over the environment variables 'http_proxy', 'all_proxy', etc.
.TP
\fB \- o\fR proxy_cred_file (default="")
This option specifies the file that contains the username and passphrase for proxy authentication when an HTTP scheme proxy is specified by the 'proxy' option.
The username and passphrase are valid only for the HTTP scheme.
If the HTTP proxy does not require authentication, this option is not required.
Separate the username and passphrase with a ':' character and specify each as a URL-encoded string.
.TP
\fB \- o\fR ipresolve (default="whatever")
Select what type of IP addresses to use when establishing a connection.
Default('whatever') can use addresses of all IP versions(IPv4 and IPv6) that your system allows.
If you specify 'IPv4', only IPv4 addresses are used.
And when 'IPv6' is specified, only IPv6 addresses will be used.
.TP
\fB\-o\fR logfile - specify the log output file.
By default, s3fs writes its log to syslog. Alternatively, if s3fs is started with the "-f" option, the log is written to stdout/stderr.
You can use this option to specify the file to which s3fs writes its log.
If you specify a log file with this option, s3fs reopens the log file when it receives a SIGHUP signal. You can use the SIGHUP signal for log rotation.
.TP
\fB\-o\fR dbglevel (default="crit")
Set the debug message level. Valid values are crit (critical), err (error), warn (warning), and info (information). The default debug level is crit.
If s3fs is run with the "-d" option, the debug level is set to info.
When s3fs catches the SIGUSR2 signal, the debug level is bumped up.
.TP
\fB\-o\fR curldbg - output curl debug messages
Output debug messages from libcurl when this option is specified.
Specify "normal" or "body" for the parameter.
If the parameter is omitted, it is the same as "normal".
If "body" is specified, some API communication body data will be output in addition to the debug messages output as "normal".
.TP
\fB\-o\fR no_time_stamp_msg - no time stamp in debug message
By default, a time stamp is included in each debug message.
If this option is specified, the time stamp is not output in debug messages.
Setting the environment variable "S3FS_MSGTIMESTAMP" to "no" has the same effect.
.TP
\fB\-o\fR set_check_cache_sigusr1 (default is stdout)
If the cache is enabled, you can check the integrity of the cache files and of their stats info files.
When this option is specified, sending the SIGUSR1 signal to the s3fs process checks the cache status at that time.
This option can take a file path as a parameter to output the check result to that file.
The file path parameter can be omitted. If omitted, the result will be output to stdout or syslog.
.TP
\fB\-o\fR update_parent_dir_stat (default is disable)
The parent directory's mtime and ctime are updated when a file or directory is created or deleted (when the parent directory's inode is updated).
By default, parent directory statistics are not updated.
.SS "utility mode options"
.TP
\fB\-u\fR or \fB\-\-incomplete\-mpu\-list\fR
Lists incomplete multipart objects uploaded to the specified bucket.
.TP
\fB\-\-incomplete\-mpu\-abort\fR all or date format (default="24H")
Deletes incomplete multipart objects uploaded to the specified bucket.
If "all" is specified for this option, all incomplete multipart objects are deleted.
If you specify no argument, objects older than 24 hours (24H) are deleted (this is the default value).
You can optionally specify a date format instead.
It is composed of year, month, day, hour, minute, and second values, expressed as "Y", "M", "D", "h", "m", and "s" respectively.
For example, "1Y6M10D12h30m30s".
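.IP
To make the date format concrete, the following Python sketch splits such a string into its components. It is an illustration only, not the s3fs parser: the names UNITS and parse_expire are hypothetical, and it accepts only the six unit letters listed above. How s3fs itself treats other spellings, such as the uppercase "H" in the default value, is not covered here.
.nf
```python
import re

UNITS = {"Y": "years", "M": "months", "D": "days",
         "h": "hours", "m": "minutes", "s": "seconds"}

def parse_expire(text):
    """Split a string such as 1Y6M10D12h30m30s into named components."""
    parts = re.findall("([0-9]+)([YMDhms])", text)
    # Reject input containing anything other than number-unit pairs.
    if not parts or "".join(n + u for n, u in parts) != text:
        raise ValueError("unrecognized date format: " + text)
    return {UNITS[u]: int(n) for n, u in parts}
```
.fi
.IP
Applied to the example above, the sketch yields 1 year, 6 months, 10 days, 12 hours, 30 minutes, and 30 seconds.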
.SH FUSE/MOUNT OPTIONS
.TP
Most of the generic mount options described in 'man mount' are supported (ro, rw, suid, nosuid, dev, nodev, exec, noexec, atime, noatime, sync, async, dirsync). Filesystems are mounted with '\-onodev,nosuid' by default, which can only be overridden by a privileged user.
.TP
Many FUSE-specific mount options can also be specified, e.g. allow_other. See the FUSE README for the full set.
.SH SERVER URL/REQUEST STYLE
Be careful when specifying the server endpoint (URL).
.TP
If your bucket name contains dots ("."), you should use the path request style (the "use_path_request_style" option).
.TP
Also, if you are using a server other than Amazon S3, you need to specify the endpoint with the "url" option. Depending on the server you are using, you may also have to specify the path request style (the "use_path_request_style" option).
.SH LOCAL STORAGE CONSUMPTION
.TP
s3fs requires local caching for operation. You can enable a local cache with "\-o use_cache"; otherwise s3fs uses temporary files to cache pending requests to S3.
.TP
Apart from the requirements discussed below, it is recommended to keep enough cache or temporary storage, respectively, to hold one copy of each file that is open for reading or writing at any one time.
.TP
.SS Local cache with \[dq]\-o use_cache\[dq]
.TP
s3fs automatically maintains a local cache of files. The cache folder is specified by the parameter of "\-o use_cache". It is only a local cache that can be deleted at any time. s3fs rebuilds it if necessary.
.TP
Whenever s3fs needs to read or write a file on S3, it first creates the file in the cache directory and operates on it.
.TP
The amount of local cache storage used can be indirectly controlled with "\-o ensure_diskfree".
.TP
.SS Without local cache
.TP
Since s3fs always requires some storage space for operation, it creates temporary files to store incoming write requests until the required S3 request size is reached and the segment has been uploaded. After that, this data is truncated in the temporary file to free up storage space.
.TP
For each file, you need at least twice the part size (default 5MB, or the value of "\-o multipart_size") for writing multipart requests, or space for the whole file if single requests are enabled ("\-o nomultipart").
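.PP
The rule of thumb above can be turned into a small estimate. The following Python sketch is only a lower bound under the stated rule, assuming all sizes are given in MB; the function name min_temp_space_mb is hypothetical, and actual usage depends on the workload.
.nf
```python
def min_temp_space_mb(part_size_mb, file_sizes_mb, multipart=True):
    """Estimate the minimum temporary storage, in MB, for files being written."""
    if multipart:
        # Multipart requests: at least twice the part size per file.
        return 2 * part_size_mb * len(file_sizes_mb)
    # Single requests (nomultipart): space for each whole file.
    return sum(file_sizes_mb)
```
.fi
.PP
With a 5 MB part size and two files of 100 MB and 2000 MB being written, the estimate is 20 MB for multipart requests and 2100 MB for single requests.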
.SH PERFORMANCE CONSIDERATIONS
.TP
This section discusses settings to improve s3fs performance.
.TP
In most cases, backend performance cannot be controlled and is therefore not part of this discussion.
.TP
Details of local storage usage are discussed in "LOCAL STORAGE CONSUMPTION".
.TP
.SS CPU and Memory Consumption
.TP
s3fs is a multi-threaded application. Depending on the workload it may use multiple CPUs and a certain amount of memory. You can monitor the CPU and memory consumption with the "top" utility.
.TP
.SS Performance of S3 requests
.TP
s3fs provides several options (e.g. "\-o multipart_size", "\-o parallel_count") to control behaviour and thus indirectly the performance. The possible combinations of these options in conjunction with the various S3 backends are so varied that there is no individual recommendation other than the default values. Improved individual settings can be found by testing and measuring.
.TP
The options "Enable no object cache" ("\-o enable_noobj_cache"), "Enable support of alternative directory names" ("\-o compat_dir"), and "Completion of file and directory information" ("\-o complement_stat") can be used to control shared access to the same bucket by different applications:
.TP
.IP \[bu]
Enable no object cache ("\-o enable_noobj_cache")
.RS
.TP
If a bucket is used exclusively by an s3fs instance, you can enable the cache for non-existent files and directories with "\-o enable_noobj_cache". This eliminates repeated requests to check the existence of an object, saving time and possibly money.
.RE
.IP \[bu]
Enable support of alternative directory names ("\-o compat_dir")
.RS
.TP
s3fs recognizes "dir/" objects as directories. Clients other than s3fs may use "dir" or "dir_$folder$" objects as directories, or directory objects may not exist at all. In order for s3fs to recognize these as directories, you can specify the "compat_dir" option.
.RE
.IP \[bu]
Completion of file and directory information ("\-o complement_stat")
.RS
.TP
s3fs uses the "x-amz-meta-mode" header to determine if an object is a file or a directory. For this reason, objects that do not have the "x-amz-meta-mode" header may not produce the expected results (for example, a directory may not be displayed). By specifying the "complement_stat" option, s3fs can automatically complete this missing attribute information, and you can get the expected results.
.RE
.SH NOTES
.TP
The maximum size of objects that s3fs can handle depends on Amazon S3. For example, objects of up to 5 GB are supported when using the single PUT API, and up to 5 TB when the Multipart Upload API is used.
.TP
s3fs leverages /etc/mime.types to "guess" the "correct" content-type based on file name extension. This means that you can copy a website to S3 and serve it up directly from S3 with correct content-types!
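.PP
Extension-based content-type guessing can be illustrated with Python's standard mimetypes module, which uses a built-in table and system mime.types files. This is an analogy, not the s3fs implementation; the function name guess_content_type and the fallback value are assumptions.
.nf
```python
import mimetypes

def guess_content_type(filename, default="application/octet-stream"):
    """Guess a content type from the file name extension."""
    content_type, _encoding = mimetypes.guess_type(filename)
    # Fall back to a generic type when the extension is unknown or missing.
    return content_type or default
```
.fi
.PP
A file named index.html maps to text/html, while a name without an extension falls back to the assumed default.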
.SH SEE ALSO
fuse(8), mount(8), fusermount(1), fstab(5)
.SH BUGS
Due to S3's "eventual consistency" limitations, file creation can and will occasionally fail. Even after a successful create, subsequent reads can fail for an indeterminate time, even after one or more successful reads. Create and read enough files and you will eventually encounter this failure. This is not a flaw in s3fs and it is not something a FUSE wrapper like s3fs can work around. The retries option does not address this issue. Your application must either tolerate or compensate for these failures, for example by retrying creates or reads.
.SH AUTHOR
s3fs was written by Randy Rizun <rrizun@gmail.com>.