/*
 * s3fs - FUSE-based file system backed by Amazon S3
 *
 * Copyright(C) 2007 Randy Rizun <rrizun@gmail.com>
 *
 * This program is free software; you can redistribute it and/or
 * modify it under the terms of the GNU General Public License
 * as published by the Free Software Foundation; either version 2
 * of the License, or (at your option) any later version.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 * GNU General Public License for more details.
 *
 * You should have received a copy of the GNU General Public License
 * along with this program; if not, write to the Free Software
 * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301, USA.
 */

#include <cstdio>
#include <cstdlib>
#include <ctime>
#include <unistd.h>
#include <fstream>
#include <sstream>
#include <algorithm>

#include "common.h"
#include "s3fs.h"
#include "curl.h"
#include "curl_multi.h"
#include "curl_util.h"
#include "s3fs_auth.h"
#include "autolock.h"
#include "s3fs_util.h"
#include "string_util.h"
#include "addhead.h"

//-------------------------------------------------------------------
// Symbols
//-------------------------------------------------------------------
static const char EMPTY_PAYLOAD_HASH[] = "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855";
static const char EMPTY_MD5_BASE64_HASH[] = "1B2M2Y8AsgTpgAmY7PhCfg==";

//-------------------------------------------------------------------
// Class S3fsCurl
//-------------------------------------------------------------------
static const int MULTIPART_SIZE = 10 * 1024 * 1024;
static const int GET_OBJECT_RESPONSE_LIMIT = 1024;

static const int IAM_EXPIRE_MERGIN = 20 * 60;  // update timing
static const char ECS_IAM_ENV_VAR[] = "AWS_CONTAINER_CREDENTIALS_RELATIVE_URI";
static const char IAMCRED_ACCESSKEYID[] = "AccessKeyId";
static const char IAMCRED_SECRETACCESSKEY[] = "SecretAccessKey";
static const char IAMCRED_ROLEARN[] = "RoleArn";

// [NOTE] about default mime.types file
// If no mime.types file is specified in the mime option, s3fs
// will look for /etc/mime.types on all operating systems and
// load mime information.
// However, in the case of macOS, when this file does not exist,
// it tries to detect the /etc/apache2/mime.types file.
// The reason for this is that apache2 is preinstalled on macOS,
// and the mime.types file is expected to exist in this path.
// If the mime.types file is not found, s3fs will exit with an
// error.
//
static const char DEFAULT_MIME_FILE[] = "/etc/mime.types";
static const char SPECIAL_DARWIN_MIME_FILE[] = "/etc/apache2/mime.types";

// [NOTICE]
// This symbol is for libcurl under 7.23.0
#ifndef CURLSHE_NOT_BUILT_IN
#define CURLSHE_NOT_BUILT_IN 5
#endif

//-------------------------------------------------------------------
// Class S3fsCurl
//-------------------------------------------------------------------
const long S3fsCurl::S3FSCURL_RESPONSECODE_NOTSET;
const long S3fsCurl::S3FSCURL_RESPONSECODE_FATAL_ERROR;
const int S3fsCurl::S3FSCURL_PERFORM_RESULT_NOTSET;
pthread_mutex_t S3fsCurl::curl_warnings_lock;
pthread_mutex_t S3fsCurl::curl_handles_lock;
S3fsCurl::callback_locks_t S3fsCurl::callback_locks;
bool S3fsCurl::is_initglobal_done = false;
CurlHandlerPool* S3fsCurl::sCurlPool = NULL;
int S3fsCurl::sCurlPoolSize = 32;
CURLSH* S3fsCurl::hCurlShare = NULL;
bool S3fsCurl::is_cert_check = true;            // default
bool S3fsCurl::is_dns_cache = true;             // default
bool S3fsCurl::is_ssl_session_cache = true;     // default
long S3fsCurl::connect_timeout = 300;           // default
time_t S3fsCurl::readwrite_timeout = 120;       // default
int S3fsCurl::retries = 5;                      // default
bool S3fsCurl::is_public_bucket = false;
acl_t S3fsCurl::default_acl = acl_t::PRIVATE;
std::string S3fsCurl::storage_class = "STANDARD";
sseckeylist_t S3fsCurl::sseckeys;
std::string S3fsCurl::ssekmsid;
sse_type_t S3fsCurl::ssetype = sse_type_t::SSE_DISABLE;
bool S3fsCurl::is_content_md5 = false;
bool S3fsCurl::is_verbose = false;
bool S3fsCurl::is_dump_body = false;
std::string S3fsCurl::AWSAccessKeyId;
std::string S3fsCurl::AWSSecretAccessKey;
std::string S3fsCurl::AWSAccessToken;
time_t S3fsCurl::AWSAccessTokenExpire = 0;
bool S3fsCurl::is_ecs = false;
bool S3fsCurl::is_ibm_iam_auth = false;
std::string S3fsCurl::IAM_cred_url = "http://169.254.169.254/latest/meta-data/iam/security-credentials/";
std::string S3fsCurl::IAMv2_token_url = "http://169.254.169.254/latest/api/token";
std::string S3fsCurl::IAMv2_token_ttl_hdr = "X-aws-ec2-metadata-token-ttl-seconds";
std::string S3fsCurl::IAMv2_token_hdr = "X-aws-ec2-metadata-token";
int S3fsCurl::IAMv2_token_ttl = 21600;
size_t S3fsCurl::IAM_field_count = 4;
std::string S3fsCurl::IAM_token_field = "Token";
std::string S3fsCurl::IAM_expiry_field = "Expiration";
std::string S3fsCurl::IAM_role;
std::string S3fsCurl::IAMv2_api_token;
int S3fsCurl::IAM_api_version = 2;
long S3fsCurl::ssl_verify_hostname = 1;         // default(original code...)

// protected by curl_warnings_lock
bool S3fsCurl::curl_warnings_once = false;

// protected by curl_handles_lock
curltime_t S3fsCurl::curl_times;
curlprogress_t S3fsCurl::curl_progress;

std::string S3fsCurl::curl_ca_bundle;
mimes_t S3fsCurl::mimeTypes;
std::string S3fsCurl::userAgent;
int S3fsCurl::max_parallel_cnt = 5;                     // default
int S3fsCurl::max_multireq = 20;                        // default
off_t S3fsCurl::multipart_size = MULTIPART_SIZE;        // default
off_t S3fsCurl::multipart_copy_size = 512 * 1024 * 1024; // default
signature_type_t S3fsCurl::signature_type = V2_OR_V4;   // default
bool S3fsCurl::is_unsigned_payload = false;             // default
bool S3fsCurl::is_ua = true;                            // default
bool S3fsCurl::listobjectsv2 = false;                   // default
bool S3fsCurl::is_use_session_token = false;            // default
bool S3fsCurl::requester_pays = false;                  // default

//-------------------------------------------------------------------
// Class methods for S3fsCurl
//-------------------------------------------------------------------
bool S3fsCurl::InitS3fsCurl()
{
    pthread_mutexattr_t attr;
    pthread_mutexattr_init(&attr);
#if S3FS_PTHREAD_ERRORCHECK
    pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
#endif
    if(0 != pthread_mutex_init(&S3fsCurl::curl_warnings_lock, &attr)){
        return false;
    }
    if(0 != pthread_mutex_init(&S3fsCurl::curl_handles_lock, &attr)){
        return false;
    }
    if(0 != pthread_mutex_init(&S3fsCurl::callback_locks.dns, &attr)){
        return false;
    }
    if(0 != pthread_mutex_init(&S3fsCurl::callback_locks.ssl_session, &attr)){
        return false;
    }
    if(!S3fsCurl::InitGlobalCurl()){
        return false;
    }
    if(!S3fsCurl::InitShareCurl()){
        return false;
    }
    if(!S3fsCurl::InitCryptMutex()){
        return false;
    }
    // [NOTE]
    // sCurlPoolSize must be over parallel(or multireq) count.
    //
    if(sCurlPoolSize < std::max(GetMaxParallelCount(), GetMaxMultiRequest())){
        sCurlPoolSize = std::max(GetMaxParallelCount(), GetMaxMultiRequest());
    }
    sCurlPool = new CurlHandlerPool(sCurlPoolSize);
    if(!sCurlPool->Init()){
        return false;
    }
    return true;
}

bool S3fsCurl::DestroyS3fsCurl()
{
    bool result = true;

    if(!S3fsCurl::DestroyCryptMutex()){
        result = false;
    }
    if(!sCurlPool->Destroy()){
        result = false;
    }
    delete sCurlPool;
    sCurlPool = NULL;
    if(!S3fsCurl::DestroyShareCurl()){
        result = false;
    }
    if(!S3fsCurl::DestroyGlobalCurl()){
        result = false;
    }
    if(0 != pthread_mutex_destroy(&S3fsCurl::callback_locks.dns)){
        result = false;
    }
    if(0 != pthread_mutex_destroy(&S3fsCurl::callback_locks.ssl_session)){
        result = false;
    }
    if(0 != pthread_mutex_destroy(&S3fsCurl::curl_handles_lock)){
        result = false;
    }
    if(0 != pthread_mutex_destroy(&S3fsCurl::curl_warnings_lock)){
        result = false;
    }
    return result;
}

bool S3fsCurl::InitGlobalCurl()
{
    if(S3fsCurl::is_initglobal_done){
        return false;
    }
    if(CURLE_OK != curl_global_init(CURL_GLOBAL_ALL)){
        S3FS_PRN_ERR("init_curl_global_all returns error.");
        return false;
    }
    S3fsCurl::is_initglobal_done = true;
    return true;
}

bool S3fsCurl::DestroyGlobalCurl()
{
    if(!S3fsCurl::is_initglobal_done){
        return false;
    }
    curl_global_cleanup();
    S3fsCurl::is_initglobal_done = false;
    return true;
}

bool S3fsCurl::InitShareCurl()
{
    CURLSHcode nSHCode;

    if(!S3fsCurl::is_dns_cache && !S3fsCurl::is_ssl_session_cache){
        S3FS_PRN_INFO("Curl does not share DNS data.");
        return true;
    }
    if(S3fsCurl::hCurlShare){
        S3FS_PRN_WARN("already initiated.");
        return false;
    }
    if(NULL == (S3fsCurl::hCurlShare = curl_share_init())){
        S3FS_PRN_ERR("curl_share_init failed");
        return false;
    }
    if(CURLSHE_OK != (nSHCode = curl_share_setopt(S3fsCurl::hCurlShare, CURLSHOPT_LOCKFUNC, S3fsCurl::LockCurlShare))){
        S3FS_PRN_ERR("curl_share_setopt(LOCKFUNC) returns %d(%s)", nSHCode, curl_share_strerror(nSHCode));
        return false;
    }
    if(CURLSHE_OK != (nSHCode = curl_share_setopt(S3fsCurl::hCurlShare, CURLSHOPT_UNLOCKFUNC, S3fsCurl::UnlockCurlShare))){
        S3FS_PRN_ERR("curl_share_setopt(UNLOCKFUNC) returns %d(%s)", nSHCode, curl_share_strerror(nSHCode));
        return false;
    }
    if(S3fsCurl::is_dns_cache){
        nSHCode = curl_share_setopt(S3fsCurl::hCurlShare, CURLSHOPT_SHARE, CURL_LOCK_DATA_DNS);
        if(CURLSHE_OK != nSHCode && CURLSHE_BAD_OPTION != nSHCode && CURLSHE_NOT_BUILT_IN != nSHCode){
            S3FS_PRN_ERR("curl_share_setopt(DNS) returns %d(%s)", nSHCode, curl_share_strerror(nSHCode));
            return false;
        }else if(CURLSHE_BAD_OPTION == nSHCode || CURLSHE_NOT_BUILT_IN == nSHCode){
            S3FS_PRN_WARN("curl_share_setopt(DNS) returns %d(%s), but continue without shared dns data.", nSHCode, curl_share_strerror(nSHCode));
        }
    }
    if(S3fsCurl::is_ssl_session_cache){
        nSHCode = curl_share_setopt(S3fsCurl::hCurlShare, CURLSHOPT_SHARE, CURL_LOCK_DATA_SSL_SESSION);
        if(CURLSHE_OK != nSHCode && CURLSHE_BAD_OPTION != nSHCode && CURLSHE_NOT_BUILT_IN != nSHCode){
            S3FS_PRN_ERR("curl_share_setopt(SSL SESSION) returns %d(%s)", nSHCode, curl_share_strerror(nSHCode));
            return false;
        }else if(CURLSHE_BAD_OPTION == nSHCode || CURLSHE_NOT_BUILT_IN == nSHCode){
            S3FS_PRN_WARN("curl_share_setopt(SSL SESSION) returns %d(%s), but continue without shared ssl session data.", nSHCode, curl_share_strerror(nSHCode));
        }
    }
    if(CURLSHE_OK != (nSHCode = curl_share_setopt(S3fsCurl::hCurlShare, CURLSHOPT_USERDATA, &S3fsCurl::callback_locks))){
        S3FS_PRN_ERR("curl_share_setopt(USERDATA) returns %d(%s)", nSHCode, curl_share_strerror(nSHCode));
        return false;
    }
    return true;
}

bool S3fsCurl::DestroyShareCurl()
{
    if(!S3fsCurl::hCurlShare){
        if(!S3fsCurl::is_dns_cache && !S3fsCurl::is_ssl_session_cache){
            return true;
        }
        S3FS_PRN_WARN("already destroy share curl.");
        return false;
    }
    if(CURLSHE_OK != curl_share_cleanup(S3fsCurl::hCurlShare)){
        return false;
    }
    S3fsCurl::hCurlShare = NULL;
    return true;
}

void S3fsCurl::LockCurlShare(CURL* handle, curl_lock_data nLockData, curl_lock_access laccess, void* useptr)
{
    if(!hCurlShare){
        return;
    }
    S3fsCurl::callback_locks_t* locks = static_cast<S3fsCurl::callback_locks_t*>(useptr);
    int result;
    if(CURL_LOCK_DATA_DNS == nLockData){
        if(0 != (result = pthread_mutex_lock(&locks->dns))){
            S3FS_PRN_CRIT("pthread_mutex_lock returned: %d", result);
            abort();
        }
    }else if(CURL_LOCK_DATA_SSL_SESSION == nLockData){
        if(0 != (result = pthread_mutex_lock(&locks->ssl_session))){
            S3FS_PRN_CRIT("pthread_mutex_lock returned: %d", result);
            abort();
        }
    }
}

void S3fsCurl::UnlockCurlShare(CURL* handle, curl_lock_data nLockData, void* useptr)
{
    if(!hCurlShare){
        return;
    }
    S3fsCurl::callback_locks_t* locks = static_cast<S3fsCurl::callback_locks_t*>(useptr);
    int result;
    if(CURL_LOCK_DATA_DNS == nLockData){
        if(0 != (result = pthread_mutex_unlock(&locks->dns))){
            S3FS_PRN_CRIT("pthread_mutex_unlock returned: %d", result);
            abort();
        }
    }else if(CURL_LOCK_DATA_SSL_SESSION == nLockData){
        if(0 != (result = pthread_mutex_unlock(&locks->ssl_session))){
            S3FS_PRN_CRIT("pthread_mutex_unlock returned: %d", result);
            abort();
        }
    }
}

bool S3fsCurl::InitCryptMutex()
{
    return s3fs_init_crypt_mutex();
}

bool S3fsCurl::DestroyCryptMutex()
{
    return s3fs_destroy_crypt_mutex();
}

// homegrown timeout mechanism
int S3fsCurl::CurlProgress(void* clientp, double dltotal, double dlnow, double ultotal, double ulnow)
{
    CURL* curl = static_cast<CURL*>(clientp);
    time_t now = time(0);
    progress_t p(dlnow, ulnow);

    AutoLock lock(&S3fsCurl::curl_handles_lock);

    // any progress?
    if(p != S3fsCurl::curl_progress[curl]){
        // yes!
        S3fsCurl::curl_times[curl] = now;
        S3fsCurl::curl_progress[curl] = p;
    }else{
        // timeout?
        if(now - S3fsCurl::curl_times[curl] > readwrite_timeout){
            S3FS_PRN_ERR("timeout now: %lld, curl_times[curl]: %lld, readwrite_timeout: %lld",
                         static_cast<long long>(now), static_cast<long long>((S3fsCurl::curl_times[curl])), static_cast<long long>(readwrite_timeout));
            return CURLE_ABORTED_BY_CALLBACK;
        }
    }
    return 0;
}
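
// [SKETCH]
// The stall check in CurlProgress() can be tried out without libcurl.
// The class below is a hypothetical stand-in, not part of s3fs: it keeps
// the last seen (dlnow, ulnow) pair and the time it last changed per
// transfer key, and reports a timeout once the pair has stayed unchanged
// for longer than the limit. Two assumptions differ from the code above:
// the first call for a key counts as progress, and no mutex is taken
// (the real callback holds curl_handles_lock).
//
```cpp
#include <cassert>
#include <ctime>
#include <map>
#include <utility>

// Hypothetical helper; names are illustrative only.
class StallDetectorSketch
{
    private:
        struct State
        {
            std::pair<double, double> progress;    // (dlnow, ulnow)
            time_t                    last_change; // time of last change
        };
        std::map<const void*, State> states_;
        time_t                       timeout_;

    public:
        explicit StallDetectorSketch(time_t timeout) : timeout_(timeout) {}

        // Returns true when the transfer identified by key has made no
        // progress for more than timeout_ seconds.
        bool Update(const void* key, double dlnow, double ulnow, time_t now)
        {
            std::pair<double, double> p(dlnow, ulnow);
            std::map<const void*, State>::iterator it = states_.find(key);
            if(it == states_.end() || it->second.progress != p){
                // progress was made (or first sighting): remember it
                State& st      = states_[key];
                st.progress    = p;
                st.last_change = now;
                return false;
            }
            return (now - it->second.last_change) > timeout_;
        }
};

void stall_detector_selfcheck()
{
    StallDetectorSketch detector(120);
    int handle = 0;                                       // any address works as a key
    assert(!detector.Update(&handle, 0, 0, 1000));        // first sighting
    assert(!detector.Update(&handle, 0, 0, 1120));        // exactly 120s: not yet
    assert(detector.Update(&handle, 0, 0, 1121));         // more than 120s: stalled
    assert(!detector.Update(&handle, 10, 0, 1121));       // bytes moved: timer resets
}
```
// The comparison is strict (">"), matching CurlProgress(), so a transfer
// idle for exactly readwrite_timeout seconds is not yet aborted.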

bool S3fsCurl::InitMimeType(const std::string& strFile)
{
    std::string MimeFile;
    if(!strFile.empty()){
        MimeFile = strFile;
    }else{
        // search default mime.types
        std::string errPaths = DEFAULT_MIME_FILE;
        struct stat st;
        if(0 == stat(DEFAULT_MIME_FILE, &st)){
            MimeFile = DEFAULT_MIME_FILE;
        }else if(compare_sysname("Darwin")){
            // for macOS, search another default file.
            if(0 == stat(SPECIAL_DARWIN_MIME_FILE, &st)){
                MimeFile = SPECIAL_DARWIN_MIME_FILE;
            }else{
                errPaths += " and ";
                errPaths += SPECIAL_DARWIN_MIME_FILE;
            }
        }
        if(MimeFile.empty()){
            S3FS_PRN_WARN("Could not find mime.types files, you have to create file(%s) or specify mime option for existing mime.types file.", errPaths.c_str());
            return false;
        }
    }
    S3FS_PRN_DBG("Try to load mime types from %s file.", MimeFile.c_str());

    std::string line;
    std::ifstream MT(MimeFile.c_str());
    if(MT.good()){
        S3FS_PRN_DBG("The old mime types are cleared to load new mime types.");
        S3fsCurl::mimeTypes.clear();

        while(getline(MT, line)){
            if(line.empty()){
                continue;
            }
            if(line[0] == '#'){
                continue;
            }

            std::istringstream tmp(line);
            std::string mimeType;
            tmp >> mimeType;
            std::string ext;
            while(tmp >> ext){
                S3fsCurl::mimeTypes[ext] = mimeType;
            }
        }
        S3FS_PRN_INIT_INFO("Loaded mime information from %s", MimeFile.c_str());
    }else{
        S3FS_PRN_WARN("Could not load mime types from %s, please check the existence and permissions of this file.", MimeFile.c_str());
        return false;
    }
    return true;
}
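
// [SKETCH]
// The parsing loop in InitMimeType() can be exercised on an in-memory
// stream. The function and typedef below are hypothetical stand-ins (the
// real table is S3fsCurl::mimeTypes of type mimes_t); the sketch assumes
// the same line format: a MIME type followed by zero or more extensions,
// with empty lines and lines starting with '#' skipped.
//
```cpp
#include <cassert>
#include <istream>
#include <map>
#include <sstream>
#include <string>

// Hypothetical stand-in for mimes_t: extension -> MIME type.
typedef std::map<std::string, std::string> mime_table_sketch_t;

// Reads mime.types style text and maps every extension on a line to the
// MIME type that starts the line.
mime_table_sketch_t parse_mime_types_sketch(std::istream& in)
{
    mime_table_sketch_t table;
    std::string         line;
    while(std::getline(in, line)){
        if(line.empty() || line[0] == '#'){
            continue;                       // skip blanks and comments
        }
        std::istringstream fields(line);
        std::string        mime_type;
        fields >> mime_type;
        std::string ext;
        while(fields >> ext){
            table[ext] = mime_type;         // later lines override earlier ones
        }
    }
    return table;
}

void parse_mime_types_selfcheck()
{
    std::istringstream in("# comment\n\ntext/html html htm\napplication/json json\nvideo/x-unknown\n");
    mime_table_sketch_t table = parse_mime_types_sketch(in);
    assert(table.size() == 3);              // a type without extensions adds nothing
    assert(table["htm"] == "text/html");
}
```
// A line that names a type but no extension, like "video/x-unknown"
// above, contributes no entries, which is also how the loop in
// InitMimeType() behaves.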

void S3fsCurl::InitUserAgent()
{
    if(S3fsCurl::userAgent.empty()){
        S3fsCurl::userAgent = "s3fs/";
        S3fsCurl::userAgent += VERSION;
        S3fsCurl::userAgent += " (commit hash ";
        S3fsCurl::userAgent += COMMIT_HASH_VAL;
        S3fsCurl::userAgent += "; ";
        S3fsCurl::userAgent += s3fs_crypt_lib_name();
        S3fsCurl::userAgent += ")";
        S3fsCurl::userAgent += instance_name;
    }
}

2013-07-05 02:28:31 +00:00
//
// @param s e.g., "index.html"
// @return e.g., "text/html"
//
2020-09-11 09:37:24 +00:00
std : : string S3fsCurl : : LookupMimeType ( const std : : string & name )
2013-07-05 02:28:31 +00:00
{
2020-08-22 12:40:53 +00:00
if ( ! name . empty ( ) & & name [ name . size ( ) - 1 ] = = ' / ' ) {
return " application/x-directory " ;
}
2020-02-02 09:43:20 +00:00
2020-09-11 09:37:24 +00:00
std : : string result ( " application/octet-stream " ) ;
std : : string : : size_type last_pos = name . find_last_of ( ' . ' ) ;
std : : string : : size_type first_pos = name . find_first_of ( ' . ' ) ;
std : : string prefix , ext , ext2 ;
2013-07-05 02:28:31 +00:00
2020-08-22 12:40:53 +00:00
// No dots in name, just return
2020-09-11 09:37:24 +00:00
if ( last_pos = = std : : string : : npos ) {
2020-08-22 12:40:53 +00:00
return result ;
}
// extract the last extension
2020-09-11 09:37:24 +00:00
ext = name . substr ( 1 + last_pos , std : : string : : npos ) ;
2020-08-22 12:40:53 +00:00
2020-09-11 09:37:24 +00:00
if ( last_pos ! = std : : string : : npos ) {
2020-08-22 12:40:53 +00:00
// one dot was found, now look for another
2020-09-11 09:37:24 +00:00
if ( first_pos ! = std : : string : : npos & & first_pos < last_pos ) {
2020-08-22 12:40:53 +00:00
prefix = name . substr ( 0 , last_pos ) ;
// Now get the second to last file extension
2020-09-11 09:37:24 +00:00
std : : string : : size_type next_pos = prefix . find_last_of ( ' . ' ) ;
if ( next_pos ! = std : : string : : npos ) {
ext2 = prefix . substr ( 1 + next_pos , std : : string : : npos ) ;
2020-08-22 12:40:53 +00:00
}
2013-07-05 02:28:31 +00:00
}
2020-08-22 12:40:53 +00:00
}
// if we get here, then we have an extension (ext)
mimes_t : : const_iterator iter = S3fsCurl : : mimeTypes . find ( ext ) ;
// if the last extension matches a mimeType, then return
// that mime type
if ( iter ! = S3fsCurl : : mimeTypes . end ( ) ) {
result = ( * iter ) . second ;
return result ;
}
// return with the default result if there isn't a second extension
if ( first_pos = = last_pos ) {
return result ;
}
// Didn't find a mime-type for the first extension
// Look for second extension in mimeTypes, return if found
iter = S3fsCurl : : mimeTypes . find ( ext2 ) ;
if ( iter ! = S3fsCurl : : mimeTypes . end ( ) ) {
result = ( * iter ) . second ;
return result ;
}
// neither the last extension nor the second-to-last extension
// matched a mimeType, return the default mime type
2013-07-05 02:28:31 +00:00
return result ;
2013-05-22 08:49:23 +00:00
}
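The fallback order used by LookupMimeType (last extension first, then the second-to-last extension, then the default type) can be sketched as a free function. This is only an illustration: `mime_map_t` and `lookup_mime_sketch` are hypothetical names, and the map passed in stands in for `S3fsCurl::mimeTypes`, which is loaded elsewhere.

```cpp
#include <map>
#include <string>

// Hypothetical stand-in for S3fsCurl::mimeTypes.
typedef std::map<std::string, std::string> mime_map_t;

// Returns the MIME type for the last extension of name, falls back to the
// second-to-last extension (e.g. "txt" in "notes.txt.bak"), then to the default.
std::string lookup_mime_sketch(const mime_map_t& types, const std::string& name)
{
    const std::string fallback("application/octet-stream");

    // A trailing slash marks a directory object.
    if(!name.empty() && name[name.size() - 1] == '/'){
        return "application/x-directory";
    }
    std::string::size_type last_pos = name.find_last_of('.');
    if(last_pos == std::string::npos){
        return fallback;                    // no dot at all
    }
    mime_map_t::const_iterator iter = types.find(name.substr(last_pos + 1));
    if(iter != types.end()){
        return iter->second;                // last extension matched
    }
    if(0 == last_pos){
        return fallback;                    // nothing before the only dot
    }
    std::string::size_type prev_pos = name.find_last_of('.', last_pos - 1);
    if(prev_pos == std::string::npos){
        return fallback;                    // only one extension
    }
    iter = types.find(name.substr(prev_pos + 1, last_pos - prev_pos - 1));
    return (iter != types.end()) ? iter->second : fallback;
}
```

With a table that maps "txt" to "text/plain", a name such as "notes.txt.bak" resolves through the second-to-last extension, while a name without any dot gets the default type.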
2019-01-23 23:44:50 +00:00
bool S3fsCurl : : LocateBundle ( )
2013-05-22 08:49:23 +00:00
{
2020-08-22 12:40:53 +00:00
// See if environment variable CURL_CA_BUNDLE is set
// if so, check it, if it is a good path, then set the
// curl_ca_bundle variable to it
if ( S3fsCurl : : curl_ca_bundle . empty ( ) ) {
char * CURL_CA_BUNDLE = getenv ( " CURL_CA_BUNDLE " ) ;
if ( CURL_CA_BUNDLE ! = NULL ) {
// check for existence and readability of the file
2020-09-11 09:37:24 +00:00
std : : ifstream BF ( CURL_CA_BUNDLE ) ;
2020-08-22 12:40:53 +00:00
if ( ! BF . good ( ) ) {
S3FS_PRN_ERR ( " %s: file specified by CURL_CA_BUNDLE environment variable is not readable " , program_name . c_str ( ) ) ;
return false ;
}
BF . close ( ) ;
2020-09-25 03:15:04 +00:00
S3fsCurl : : curl_ca_bundle = CURL_CA_BUNDLE ;
2020-08-22 12:40:53 +00:00
return true ;
}
} else {
// Already set ca bundle variable
return true ;
2013-07-05 02:28:31 +00:00
}
2020-08-22 12:40:53 +00:00
// not set via environment variable, look in likely locations
///////////////////////////////////////////
// following comment from curl's (7.21.2) acinclude.m4 file
///////////////////////////////////////////
// dnl CURL_CHECK_CA_BUNDLE
// dnl -------------------------------------------------
// dnl Check if a default ca-bundle should be used
// dnl
// dnl regarding the paths this will scan:
// dnl /etc/ssl/certs/ca-certificates.crt Debian systems
// dnl /etc/pki/tls/certs/ca-bundle.crt Redhat and Mandriva
// dnl /usr/share/ssl/certs/ca-bundle.crt old(er) Redhat
// dnl /usr/local/share/certs/ca-root.crt FreeBSD
// dnl /etc/ssl/cert.pem OpenBSD
// dnl /etc/ssl/certs/ (ca path) SUSE
///////////////////////////////////////////
// curl itself should already have checked the path above that
// matches the OS, so checking files here is not strictly needed.
// We therefore probe only a few of these files.
//
2020-09-11 09:37:24 +00:00
std : : ifstream BF ( " /etc/pki/tls/certs/ca-bundle.crt " ) ;
2017-11-05 11:26:05 +00:00
if ( BF . good ( ) ) {
BF . close ( ) ;
2020-09-25 03:15:04 +00:00
S3fsCurl : : curl_ca_bundle = " /etc/pki/tls/certs/ca-bundle.crt " ;
2020-08-22 12:40:53 +00:00
} else {
BF . open ( " /etc/ssl/certs/ca-certificates.crt " ) ;
2017-11-05 11:26:05 +00:00
if ( BF . good ( ) ) {
2020-08-22 12:40:53 +00:00
BF . close ( ) ;
2020-09-25 03:15:04 +00:00
S3fsCurl : : curl_ca_bundle = " /etc/ssl/certs/ca-certificates.crt " ;
2017-11-05 11:26:05 +00:00
} else {
2020-08-22 12:40:53 +00:00
BF . open ( " /usr/share/ssl/certs/ca-bundle.crt " ) ;
if ( BF . good ( ) ) {
BF . close ( ) ;
2020-09-25 03:15:04 +00:00
S3fsCurl : : curl_ca_bundle = " /usr/share/ssl/certs/ca-bundle.crt " ;
2020-08-22 12:40:53 +00:00
} else {
BF . open ( " /usr/local/share/certs/ca-root.crt " ) ;
if ( BF . good ( ) ) {
BF . close ( ) ;
2020-09-25 03:15:04 +00:00
S3fsCurl : : curl_ca_bundle = " /usr/local/share/certs/ca-root.crt " ;
2020-08-22 12:40:53 +00:00
} else {
S3FS_PRN_ERR ( " %s: /.../ca-bundle.crt is not readable " , program_name . c_str ( ) ) ;
return false ;
}
}
2017-11-05 11:26:05 +00:00
}
}
2020-08-22 12:40:53 +00:00
return true ;
2013-07-05 02:28:31 +00:00
}
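The nested if/else probing in LocateBundle could also be written as a loop over candidate paths, so each path appears exactly once and the probed path is always the one that gets stored. A minimal sketch, assuming the same readability test (`std::ifstream::good()`); `first_readable_sketch` is a hypothetical helper, not part of s3fs.

```cpp
#include <cstddef>
#include <fstream>
#include <string>

// Returns the first candidate that can be opened for reading,
// or an empty string when none of them can.
std::string first_readable_sketch(const char* const* candidates, size_t count)
{
    for(size_t i = 0; i < count; ++i){
        std::ifstream probe(candidates[i]);
        if(probe.good()){
            return candidates[i];
        }
    }
    return std::string();
}
```

A caller would pass the four paths probed above in the same order and treat an empty result as the "not readable" error case.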
Changed code for performance (part 3)
* Summary
This revision makes a big change to how temporary files and local cache
files are handled.
With this change, s3fs performs well when it opens, closes, syncs, and reads
objects.
* Detail
1) About temporary files (local files)
s3fs uses a temporary file on the local file system when it downloads,
uploads, opens, or seeks an object on S3.
From this revision on, s3fs calls ftruncate() when it creates the temporary
file.
This lets s3fs set the file to the exact object length without downloading.
(Notice: ftruncate() is specified for XSI-compliant systems, so you may have
a problem on non-XSI-compliant systems.)
With this change, s3fs can download part of an object by sending a "Range"
HTTP header, so downloading happens one block unit at a time.
The default block (part) size is 50MB: the default parallel request count (5)
multiplied by the default multipart upload size (10MB).
To change the block size, use the new "fd_page_size" option, which accepts
any value from 1MB (1024 * 1024 bytes) upward.
Note that fdcache.cpp (and fdcache.h) changed a lot as a result.
2) About the local cache
Local cache files in the directory specified by the "use_cache" option do not
always hold all of the object data.
The reason is that s3fs uses ftruncate() and reads (writes) a temporary file
one block unit at a time.
s3fs tracks whether each block unit is a "downloaded area" or not.
To hold this status, s3fs creates a new temporary file in the cache directory
specified by the "use_cache" option. These status files are in a directory
named "<use_cache directory>/.<bucket_name>/".
When s3fs opens a status file, it locks the file for exclusive control by
calling flock(). Take care: the status files therefore cannot be placed on a
network drive (like NFS).
This revision also changes the file open mode: s3fs always opens a local
cache file and each status file in writable mode.
Finally, this revision adds the new option "del_cache", which makes s3fs
delete all local cache files when it starts and exits.
3) Uploading
When s3fs writes data to a file descriptor through a FUSE request, older
revisions downloaded the whole object. The new revision downloads only a
small partial area (some block units) that includes the area being written.
When s3fs closes or flushes the file descriptor, it downloads the remaining
areas that have not yet been downloaded from the server, and then uploads
all of the data.
Revision r456 already added parallel upload, so this revision together with
r456 and r457 is a very big change for performance.
4) Downloading
Because of the changes to temporary files and local cache files, s3fs
downloads only the required range (some block units) of an object.
s3fs downloads the units with parallel GET requests, as in the uploading
case. (The maximum parallel request count and each download size are set by
the same parameters as for uploading.)
In the new revision, s3fs returns the file descriptor right away when it
opens a file, because it only opens (creates) the file descriptor without
downloading data. When s3fs reads data, it downloads only the block units
that include the requested area.
This is good for performance.
5) Changed option name
The "parallel_upload" option added in r456 is renamed to "parallel_count",
because its value is used not only for uploading objects but also for
downloading them. (For a while, the old name "parallel_upload" remains
usable for compatibility.)
git-svn-id: http://s3fs.googlecode.com/svn/trunk@458 df820570-a93a-0410-bd06-b72b767a4274
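The block-unit download described in this message can be illustrated by computing the "Range" header value for the block units that cover a requested area. This is a sketch under stated assumptions: `block_range_sketch` is a hypothetical helper, and aligning requests to whole multiples of the page size is an assumption made for illustration, not something the commit message specifies.

```cpp
#include <sstream>
#include <string>
#include <sys/types.h>

// Builds an HTTP Range header value that covers every whole block unit
// touching [start, start + size), clipped to the object size.
// Returns an empty string for an empty or out-of-range request.
std::string block_range_sketch(off_t start, off_t size, off_t object_size, off_t page_size)
{
    if(start < 0 || size <= 0 || page_size <= 0 || start >= object_size){
        return std::string();
    }
    // Round the first byte down and the last byte up to block boundaries.
    off_t first = (start / page_size) * page_size;
    off_t last  = ((start + size - 1) / page_size + 1) * page_size - 1;
    if(last >= object_size){
        last = object_size - 1;     // byte positions in Range are inclusive
    }
    std::ostringstream range;
    range << "bytes=" << first << "-" << last;
    return range.str();
}
```

For example, with a page size of 100 bytes, a 10-byte read at offset 120 of a 1000-byte object maps to "bytes=100-199", which is the single block unit containing that area.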
2013-07-23 16:01:48 +00:00
size_t S3fsCurl : : WriteMemoryCallback ( void * ptr , size_t blockSize , size_t numBlocks , void * data )
2013-07-05 02:28:31 +00:00
{
2020-08-22 12:40:53 +00:00
BodyData * body = static_cast < BodyData * > ( data ) ;
if ( ! body - > Append ( ptr , blockSize , numBlocks ) ) {
S3FS_PRN_CRIT ( " BodyData.Append() returned false. " ) ;
S3FS_FUSE_EXIT ( ) ;
return - 1 ;
}
return ( blockSize * numBlocks ) ;
2013-05-22 08:49:23 +00:00
}
2013-07-23 16:01:48 +00:00
size_t S3fsCurl : : ReadCallback ( void * ptr , size_t size , size_t nmemb , void * userp )
2013-05-22 08:49:23 +00:00
{
2021-01-24 23:15:17 +00:00
S3fsCurl * pCurl = static_cast < S3fsCurl * > ( userp ) ;
2013-07-05 02:28:31 +00:00
2020-08-22 12:40:53 +00:00
if ( 1 > ( size * nmemb ) ) {
return 0 ;
}
if ( 0 > = pCurl - > postdata_remaining ) {
return 0 ;
}
2021-06-13 03:50:07 +00:00
size_t copysize = std : : min ( static_cast < off_t > ( size * nmemb ) , pCurl - > postdata_remaining ) ;
2020-08-22 12:40:53 +00:00
memcpy ( ptr , pCurl - > postdata , copysize ) ;
2013-07-05 02:28:31 +00:00
2021-06-13 03:50:07 +00:00
pCurl - > postdata_remaining = ( pCurl - > postdata_remaining > static_cast < off_t > ( copysize ) ? ( pCurl - > postdata_remaining - copysize ) : 0 ) ;
2020-08-22 12:40:53 +00:00
pCurl - > postdata + = static_cast < size_t > ( copysize ) ;
2013-07-05 02:28:31 +00:00
2020-08-22 12:40:53 +00:00
return copysize ;
2013-05-22 08:49:23 +00:00
}
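The contract that ReadCallback follows (copy at most size * nmemb bytes, advance the source pointer, return 0 once nothing remains) can be tried in isolation without libcurl. In this sketch `upload_state_sketch` and `read_callback_sketch` are hypothetical names; the struct stands in for the `postdata` and `postdata_remaining` members used above, and it uses `size_t` rather than `off_t` for the remaining count to keep the example small.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstring>

// Hypothetical upload state, mirroring postdata / postdata_remaining.
struct upload_state_sketch
{
    const char* data;       // next byte to send
    size_t      remaining;  // bytes left to send
};

// Shaped like a CURLOPT_READFUNCTION callback: copies up to size * nmemb
// bytes into ptr and returns the number of bytes copied; 0 means end of data.
size_t read_callback_sketch(char* ptr, size_t size, size_t nmemb, void* userp)
{
    upload_state_sketch* state = static_cast<upload_state_sketch*>(userp);
    size_t room = size * nmemb;
    if(0 == room || 0 == state->remaining){
        return 0;
    }
    size_t copysize = std::min(room, state->remaining);
    std::memcpy(ptr, state->data, copysize);
    state->data      += copysize;
    state->remaining -= copysize;
    return copysize;
}
```

Calling it repeatedly with an 8-byte buffer over an 11-byte payload yields 8 bytes, then 3 bytes, then 0, which is how the transfer learns that the body is complete.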
2013-07-23 16:01:48 +00:00
size_t S3fsCurl : : HeaderCallback ( void * data , size_t blockSize , size_t numBlocks , void * userPtr )
2013-03-30 13:37:14 +00:00
{
2021-01-24 23:15:17 +00:00
headers_t * headers = static_cast < headers_t * > ( userPtr ) ;
std : : string header ( static_cast < char * > ( data ) , blockSize * numBlocks ) ;
2020-09-11 09:37:24 +00:00
std : : string key ;
std : : istringstream ss ( header ) ;
2020-08-22 12:40:53 +00:00
if ( getline ( ss , key , ' : ' ) ) {
// Lowercase the key, but only for "x-amz" headers
2020-09-11 09:37:24 +00:00
std : : string lkey = key ;
2020-08-22 12:40:53 +00:00
transform ( lkey . begin ( ) , lkey . end ( ) , lkey . begin ( ) , static_cast < int ( * ) ( int ) > ( std : : tolower ) ) ;
2020-09-26 05:09:20 +00:00
if ( is_prefix ( lkey . c_str ( ) , " x-amz " ) ) {
2020-08-22 12:40:53 +00:00
key = lkey ;
}
2020-09-11 09:37:24 +00:00
std : : string value ;
2020-08-22 12:40:53 +00:00
getline ( ss , value ) ;
( * headers ) [ key ] = trim ( value ) ;
}
return blockSize * numBlocks ;
2011-09-01 19:24:12 +00:00
}
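The key handling in HeaderCallback (lowercase only keys that start with "x-amz", trim the value) can be sketched on plain strings. All names ending in `_sketch` are hypothetical, and `trim_sketch` is a stand-in for the `trim()` helper used above, whose exact behavior is not shown here. One deliberate difference: this sketch skips lines without a colon, whereas the code above stores such a line as a key with an empty value.

```cpp
#include <algorithm>
#include <cctype>
#include <map>
#include <string>

typedef std::map<std::string, std::string> header_map_sketch;

// Strips leading and trailing spaces, tabs, CR and LF.
static std::string trim_sketch(const std::string& s)
{
    const char* ws = " \t\r\n";
    std::string::size_type b = s.find_first_not_of(ws);
    if(b == std::string::npos){
        return std::string();
    }
    std::string::size_type e = s.find_last_not_of(ws);
    return s.substr(b, e - b + 1);
}

static char lower_char_sketch(char c)
{
    return static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
}

// Stores one "Key: value" line. Keys that start with "x-amz" in any letter
// case are stored lowercased; other keys keep their original case.
bool store_header_sketch(header_map_sketch& headers, const std::string& line)
{
    std::string::size_type colon = line.find(':');
    if(colon == std::string::npos){
        return false;
    }
    std::string key  = line.substr(0, colon);
    std::string lkey = key;
    std::transform(lkey.begin(), lkey.end(), lkey.begin(), lower_char_sketch);
    if(0 == lkey.compare(0, 5, "x-amz")){
        key = lkey;
    }
    headers[key] = trim_sketch(line.substr(colon + 1));
    return true;
}
```

Normalizing only the "x-amz" keys lets later lookups of metadata headers use a single lowercase spelling while leaving standard headers such as Content-Type as the server sent them.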
2013-07-23 16:01:48 +00:00
size_t S3fsCurl : : UploadReadCallback ( void * ptr , size_t size , size_t nmemb , void * userp )
2013-07-12 00:33:36 +00:00
{
2021-01-24 23:15:17 +00:00
S3fsCurl * pCurl = static_cast < S3fsCurl * > ( userp ) ;
2013-07-12 00:33:36 +00:00
2020-08-22 12:40:53 +00:00
if ( 1 > ( size * nmemb ) ) {
return 0 ;
}
if ( - 1 = = pCurl - > partdata . fd | | 0 > = pCurl - > partdata . size ) {
return 0 ;
}
// read size
ssize_t copysize = ( size * nmemb ) < ( size_t ) pCurl - > partdata . size ? ( size * nmemb ) : ( size_t ) pCurl - > partdata . size ;
ssize_t readbytes ;
ssize_t totalread ;
// read and set
for ( totalread = 0 , readbytes = 0 ; totalread < copysize ; totalread + = readbytes ) {
readbytes = pread ( pCurl - > partdata . fd , & ( ( char * ) ptr ) [ totalread ] , ( copysize - totalread ) , pCurl - > partdata . startpos + totalread ) ;
if ( 0 = = readbytes ) {
// eof
break ;
} else if ( - 1 = = readbytes ) {
// error
S3FS_PRN_ERR ( " read file error(%d). " , errno ) ;
return 0 ;
}
}
pCurl - > partdata . startpos + = totalread ;
pCurl - > partdata . size - = totalread ;
return totalread ;
2013-07-12 00:33:36 +00:00
}
2013-07-23 16:01:48 +00:00
size_t S3fsCurl : : DownloadWriteCallback ( void * ptr , size_t size , size_t nmemb , void * userp )
{
2021-01-24 23:15:17 +00:00
S3fsCurl * pCurl = static_cast < S3fsCurl * > ( userp ) ;
2013-07-23 16:01:48 +00:00
2020-08-22 12:40:53 +00:00
if ( 1 > ( size * nmemb ) ) {
return 0 ;
}
if ( - 1 = = pCurl - > partdata . fd | | 0 > = pCurl - > partdata . size ) {
return 0 ;
}
2021-06-30 00:25:36 +00:00
// Buffer initial bytes in case it is an XML error response.
if ( pCurl - > bodydata . size ( ) < GET_OBJECT_RESPONSE_LIMIT ) {
pCurl - > bodydata . Append ( ptr , std : : min ( size * nmemb , GET_OBJECT_RESPONSE_LIMIT - pCurl - > bodydata . size ( ) ) ) ;
}
2020-08-22 12:40:53 +00:00
// write size
ssize_t copysize = ( size * nmemb ) < ( size_t ) pCurl - > partdata . size ? ( size * nmemb ) : ( size_t ) pCurl - > partdata . size ;
ssize_t writebytes ;
ssize_t totalwrite ;
// write
for ( totalwrite = 0 , writebytes = 0 ; totalwrite < copysize ; totalwrite + = writebytes ) {
writebytes = pwrite ( pCurl - > partdata . fd , & ( ( char * ) ptr ) [ totalwrite ] , ( copysize - totalwrite ) , pCurl - > partdata . startpos + totalwrite ) ;
if ( 0 = = writebytes ) {
// eof?
break ;
} else if ( - 1 = = writebytes ) {
// error
S3FS_PRN_ERR ( " write file error(%d). " , errno ) ;
return 0 ;
}
}
pCurl - > partdata . startpos + = totalwrite ;
pCurl - > partdata . size - = totalwrite ;
return totalwrite ;
Changes codes for performance(part 3)
* Summay
This revision includes big change about temporary file and local cache file.
By this big change, s3fs works with good performance when s3fs opens/
closes/syncs/reads object.
I made a big change about the handling about temporary file and local cache
file to do this implementation.
* Detail
1) About temporary file(local file)
s3fs uses a temporary file on the local file system when it downloads,
uploads, opens, or seeks an object on S3.
As of this revision, s3fs calls ftruncate() when it creates the temporary
file.
In this way s3fs can set the file to the exact object size without downloading
any data.
(Notice - ftruncate() is specified for XSI-compliant systems, so you may have
a problem on non-XSI-compliant systems.)
With this change, s3fs can download part of an object by sending a request
with the "Range" HTTP header, so downloads proceed block by block.
The default block (part) size is 50MB, which is the default parallel request
count (5) multiplied by the default multipart upload size (10MB).
If you need to change this block size, use the new option "fd_page_size".
This option accepts any value of 1MB (1024 * 1024 bytes) or more.
Note that fdcache.cpp (and fdcache.h) were changed a lot for this.
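A minimal sketch of the idea in 1), assuming a POSIX system. The helper names
(make_sized_tempfile, range_header, default_block_size) are hypothetical and
are not the functions in fdcache.cpp. It sizes a temporary file with
ftruncate() without writing any data, builds the "Range" header value for one
block, and reproduces the default block size arithmetic.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical helper: create an unlinked temporary file whose size is
// exactly `size` bytes without writing any data. Returns the fd, or -1.
int make_sized_tempfile(off_t size)
{
    assert(0 <= size);
    char path[] = "/tmp/s3fs_sketch_XXXXXX";
    int fd = mkstemp(path);
    if(-1 == fd){
        return -1;
    }
    unlink(path);                       // the file lives until fd is closed
    if(-1 == ftruncate(fd, size)){
        close(fd);
        return -1;
    }
    return fd;
}

// Hypothetical helper: value of the HTTP "Range" header for the block that
// starts at `start` and is `size` bytes long (the end offset is inclusive).
std::string range_header(off_t start, off_t size)
{
    assert(0 <= start && 0 < size);
    return "bytes=" + std::to_string(start) + "-" + std::to_string(start + size - 1);
}

// Default block size as described above: parallel count x multipart size.
off_t default_block_size(int parallel_count = 5, off_t multipart_mb = 10)
{
    return static_cast<off_t>(parallel_count) * multipart_mb * 1024 * 1024;
}
```

Because the file already has its final size, a block downloaded with a Range
request can be written at its own offset with pwrite() in any order.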
2) About local cache
Local cache files, which are in the directory specified by the "use_cache"
option, do not always contain all of the object data.
This is because s3fs uses ftruncate() and reads (writes) the temporary file
one block unit at a time.
s3fs tracks whether each block unit is a "downloaded area" or not.
To hold this status, s3fs creates a new file under the cache directory
specified by the "use_cache" option. These status files are placed in the
directory named "<use_cache directory>/.<bucket_name>/".
When s3fs opens a status file, it locks the file for exclusive control by
calling flock(). Because of this, the status files can not be placed on a
network drive (such as NFS).
This revision also changes the file open mode: s3fs always opens a local
cache file and each status file in writable mode.
Last, this revision adds the new option "del_cache", which makes s3fs delete
all local cache files when it starts and exits.
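A rough sketch of the bookkeeping in 2), assuming a system that provides
flock(). The names open_locked_status_file and BlockStatus are hypothetical;
the real status file format is not shown here. The first helper opens a status
file in writable mode and takes an exclusive lock, and the struct keeps one
"downloaded" flag per block unit.

```cpp
#include <cassert>
#include <cstddef>
#include <fcntl.h>
#include <sys/file.h>
#include <sys/types.h>
#include <unistd.h>
#include <vector>

// Hypothetical helper: open (or create) a status file in writable mode and
// take an exclusive lock on it. Returns the locked fd, or -1 on error.
// The lock is released when the fd is closed.
int open_locked_status_file(const char* path)
{
    assert(NULL != path);
    int fd = open(path, O_RDWR | O_CREAT, 0600);
    if(-1 == fd){
        return -1;
    }
    if(-1 == flock(fd, LOCK_EX)){
        close(fd);
        return -1;
    }
    return fd;
}

// Hypothetical in-memory form of the status: one flag per block unit.
struct BlockStatus
{
    off_t             block_size;
    std::vector<bool> loaded;

    BlockStatus(off_t file_size, off_t blk) : block_size(blk), loaded(count_blocks(file_size, blk), false) {}

    // Number of block units needed to cover file_size bytes (rounded up).
    static size_t count_blocks(off_t file_size, off_t blk)
    {
        assert(0 <= file_size && 0 < blk);
        return static_cast<size_t>((file_size + blk - 1) / blk);
    }
    void mark_downloaded(size_t index) { loaded.at(index) = true; }
    bool is_downloaded(off_t offset) const { return loaded.at(static_cast<size_t>(offset / block_size)); }
};
```

flock() locks are generally not reliable across NFS clients, which is why the
status files need to stay on a local file system.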
3) Uploading
When s3fs writes data to a file descriptor through a FUSE request, older s3fs
revisions download the whole object. The new revision does not; it downloads
only the small partial area (some block units) that includes the area being
written.
When s3fs closes or flushes the file descriptor, it downloads the remaining
areas that have not yet been downloaded from the server, and then uploads all
of the data.
Revision r456 already added the parallel upload function, so r456 together
with r457 and this revision is a very big change for performance.
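As an illustration of "only the block units including the written area", the
following sketch computes which block units a write touches. The function name
blocks_for_write is hypothetical, and the sketch assumes fixed-size block units
counted from offset 0.

```cpp
#include <cassert>
#include <sys/types.h>
#include <utility>

// Hypothetical helper: indexes of the first and last block units that a
// write of `length` bytes at `offset` touches. Only these units would need
// to be downloaded before the write is applied to the temporary file.
std::pair<off_t, off_t> blocks_for_write(off_t offset, off_t length, off_t block_size)
{
    assert(0 <= offset && 0 < length && 0 < block_size);
    off_t first = offset / block_size;
    off_t last  = (offset + length - 1) / block_size;
    return std::make_pair(first, last);
}
```

For example, with 50MB block units a 2MB write starting at offset 49MB touches
units 0 and 1, so at most two units are fetched instead of the whole object.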
4) Downloading
Because of the changes to the temporary file and the local cache file, s3fs
downloads only the required range (some block units) of an object.
s3fs downloads these units with parallel GET requests, in the same way as
uploading. (The maximum parallel request count and the size of each download
are specified by the same parameters as for uploading.)
In the new revision, when s3fs opens a file it returns the file descriptor
immediately, because it only opens (creates) the file descriptor without
downloading any data. When s3fs reads data, it downloads only the block units
that include the specified area.
This is good for performance.
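The splitting of a required range into parallel requests can be sketched as
follows. PartRange, split_range and request_rounds are hypothetical names, and
the sketch assumes, as a simplification, that requests are issued in rounds of
at most the parallel request count.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <sys/types.h>
#include <vector>

// Hypothetical type: one GET request's byte range.
struct PartRange
{
    off_t start;
    off_t size;
};

// Split [start, start + size) into parts of at most `part_size` bytes.
std::vector<PartRange> split_range(off_t start, off_t size, off_t part_size)
{
    assert(0 <= start && 0 <= size && 0 < part_size);
    std::vector<PartRange> parts;
    for(off_t remaining = size; 0 < remaining; ){
        off_t chunk = std::min(remaining, part_size);
        PartRange part = { start + (size - remaining), chunk };
        parts.push_back(part);
        remaining -= chunk;
    }
    return parts;
}

// Number of rounds needed when at most `parallel_count` requests run at once.
size_t request_rounds(size_t part_count, size_t parallel_count)
{
    assert(0 < parallel_count);
    return (part_count + parallel_count - 1) / parallel_count;
}
```

With the defaults mentioned above (10MB parts, 5 parallel requests), one 50MB
block unit maps to five parts that fit in a single round.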
5) Changed option name
The option "parallel_upload", which was added at r456, is renamed to
"parallel_count". The reason is that this option value is used not only when
uploading objects but also when downloading them. (For a while, you can still
use the old option name "parallel_upload" for compatibility.)
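One possible way to accept both spellings is sketched below. The function
parse_parallel_count is hypothetical and is not the option parser in s3fs; the
rule that the value must be a positive integer is also an assumption made for
the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Hypothetical parser: accepts "parallel_count=N" and, for compatibility,
// the old spelling "parallel_upload=N". Returns true and stores N on a match.
bool parse_parallel_count(const char* arg, int& count)
{
    assert(NULL != arg);
    static const char* const names[] = { "parallel_count=", "parallel_upload=" };
    for(size_t i = 0; i < sizeof(names) / sizeof(names[0]); ++i){
        size_t len = strlen(names[i]);
        if(0 == strncmp(arg, names[i], len)){
            char* end   = NULL;
            long  value = strtol(arg + len, &end, 10);
            if(end == arg + len || '\0' != *end || value <= 0){
                return false;   // empty, trailing garbage, or not positive (assumed rule)
            }
            count = static_cast<int>(value);
            return true;
        }
    }
    return false;
}
```

Keeping the old name in the same table as the new one makes it easy to drop
the compatibility spelling later by removing a single entry.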
git-svn-id: http://s3fs.googlecode.com/svn/trunk@458 df820570-a93a-0410-bd06-b72b767a4274
2013-07-23 16:01:48 +00:00
}
2020-08-22 12:40:53 +00:00
bool S3fsCurl::SetCheckCertificate(bool isCertCheck)
{
    bool old = S3fsCurl::is_cert_check;
    S3fsCurl::is_cert_check = isCertCheck;
    return old;
2015-05-20 15:32:36 +00:00
}
2013-07-05 02:28:31 +00:00
bool S3fsCurl::SetDnsCache(bool isCache)
2013-03-30 13:37:14 +00:00
{
2020-08-22 12:40:53 +00:00
    bool old = S3fsCurl::is_dns_cache;
    S3fsCurl::is_dns_cache = isCache;
    return old;
2013-07-05 02:28:31 +00:00
}
2011-03-01 19:35:55 +00:00
2021-01-04 13:57:56 +00:00
void S3fsCurl::ResetOffset(S3fsCurl* pCurl)
{
    pCurl->partdata.startpos = pCurl->b_partdata_startpos;
    pCurl->partdata.size = pCurl->b_partdata_size;
}
2013-09-14 21:50:39 +00:00
bool S3fsCurl::SetSslSessionCache(bool isCache)
{
2020-08-22 12:40:53 +00:00
    bool old = S3fsCurl::is_ssl_session_cache;
    S3fsCurl::is_ssl_session_cache = isCache;
    return old;
2013-09-14 21:50:39 +00:00
}
2013-07-05 02:28:31 +00:00
long S3fsCurl::SetConnectTimeout(long timeout)
{
2020-08-22 12:40:53 +00:00
    long old = S3fsCurl::connect_timeout;
    S3fsCurl::connect_timeout = timeout;
    return old;
2013-07-05 02:28:31 +00:00
}
2011-03-01 19:35:55 +00:00
2013-07-05 02:28:31 +00:00
time_t S3fsCurl::SetReadwriteTimeout(time_t timeout)
{
2020-08-22 12:40:53 +00:00
    time_t old = S3fsCurl::readwrite_timeout;
    S3fsCurl::readwrite_timeout = timeout;
    return old;
2011-03-01 19:35:55 +00:00
}
2013-07-05 02:28:31 +00:00
int S3fsCurl::SetRetries(int count)
2013-03-30 13:37:14 +00:00
{
2020-08-22 12:40:53 +00:00
    int old = S3fsCurl::retries;
    S3fsCurl::retries = count;
    return old;
2013-07-05 02:28:31 +00:00
}
2011-03-01 19:35:55 +00:00
2013-07-05 02:28:31 +00:00
bool S3fsCurl::SetPublicBucket(bool flag)
{
2020-08-22 12:40:53 +00:00
    bool old = S3fsCurl::is_public_bucket;
    S3fsCurl::is_public_bucket = flag;
    return old;
2011-03-01 19:35:55 +00:00
}
2019-08-09 20:51:01 +00:00
acl_t S3fsCurl::SetDefaultAcl(acl_t acl)
2013-03-30 13:37:14 +00:00
{
2020-08-22 12:40:53 +00:00
    acl_t old = S3fsCurl::default_acl;
    S3fsCurl::default_acl = acl;
    return old;
2013-07-05 02:28:31 +00:00
}
2011-08-31 22:20:20 +00:00
2019-08-09 20:51:01 +00:00
acl_t S3fsCurl::GetDefaultAcl()
2017-11-23 08:46:24 +00:00
{
2020-08-22 12:40:53 +00:00
    return S3fsCurl::default_acl;
2017-11-23 08:46:24 +00:00
}
2021-05-21 14:34:31 +00:00
std::string S3fsCurl::SetStorageClass(const std::string& storage_class)
2013-07-05 02:28:31 +00:00
{
2021-05-21 14:34:31 +00:00
    std::string old = S3fsCurl::storage_class;
2020-08-22 12:40:53 +00:00
    S3fsCurl::storage_class = storage_class;
2021-06-30 00:03:31 +00:00
    // AWS requires uppercase storage class values
    transform(S3fsCurl::storage_class.begin(), S3fsCurl::storage_class.end(), S3fsCurl::storage_class.begin(), ::toupper);
2020-08-22 12:40:53 +00:00
    return old;
2011-08-31 22:20:20 +00:00
}
2020-09-14 08:47:21 +00:00
bool S3fsCurl::PushbackSseKeys(const std::string& input)
2014-08-26 17:11:10 +00:00
{
2020-09-14 08:47:21 +00:00
    std::string onekey = trim(input);
2020-08-22 12:40:53 +00:00
    if(onekey.empty()){
        return false;
2019-01-05 21:08:41 +00:00
    }
2020-08-22 12:40:53 +00:00
    if('#' == onekey[0]){
        return false;
2019-01-05 21:08:41 +00:00
    }
2020-08-22 12:40:53 +00:00
    // make base64 if the key is short enough, otherwise assume it is already so
2020-09-11 09:37:24 +00:00
    std::string base64_key;
    std::string raw_key;
2020-08-22 12:40:53 +00:00
    if(onekey.length() > 256 / 8){
        char* p_key;
        size_t keylength;
2021-08-31 00:22:10 +00:00
        if(NULL != (p_key = (char*)s3fs_decode64(onekey.c_str(), onekey.size(), &keylength))){
2020-09-11 09:37:24 +00:00
            raw_key = std::string(p_key, keylength);
2020-08-22 12:40:53 +00:00
            base64_key = onekey;
            delete[] p_key;
        }else{
            S3FS_PRN_ERR("Failed to convert base64 to SSE-C key %s", onekey.c_str());
            return false;
        }
    }else{
        char* pbase64_key;
2014-08-26 17:11:10 +00:00
2020-08-22 12:40:53 +00:00
        if(NULL != (pbase64_key = s3fs_base64((unsigned char*)onekey.c_str(), onekey.length()))){
            raw_key = onekey;
            base64_key = pbase64_key;
            delete[] pbase64_key;
        }else{
            S3FS_PRN_ERR("Failed to convert base64 from SSE-C key %s", onekey.c_str());
            return false;
        }
    }
    // make MD5
2020-09-11 09:37:24 +00:00
    std::string strMd5;
2020-08-22 12:40:53 +00:00
    if(!make_md5_from_binary(raw_key.c_str(), raw_key.length(), strMd5)){
        S3FS_PRN_ERR("Could not make MD5 from SSE-C keys(%s).", raw_key.c_str());
        return false;
    }
    // mapped MD5 = SSE Key
    sseckeymap_t md5map;
    md5map.clear();
    md5map[strMd5] = base64_key;
    S3fsCurl::sseckeys.push_back(md5map);
    return true;
2014-08-26 17:11:10 +00:00
}
2015-10-06 14:46:14 +00:00
sse_type_t S3fsCurl::SetSseType(sse_type_t type)
{
2020-08-22 12:40:53 +00:00
    sse_type_t old = S3fsCurl::ssetype;
    S3fsCurl::ssetype = type;
    return old;
2015-10-06 14:46:14 +00:00
}
bool S3fsCurl::SetSseCKeys(const char* filepath)
2014-07-19 19:02:55 +00:00
{
2020-08-22 12:40:53 +00:00
    if(!filepath){
        S3FS_PRN_ERR("SSE-C keys filepath is empty.");
        return false;
    }
    struct stat st;
    if(0 != stat(filepath, &st)){
        S3FS_PRN_ERR("could not open use_sse keys file(%s).", filepath);
        return false;
    }
    if(st.st_mode & (S_IXUSR | S_IRWXG | S_IRWXO)){
        S3FS_PRN_ERR("use_sse keys file %s should be 0600 permissions.", filepath);
        return false;
    }
2015-10-06 14:46:14 +00:00
2020-08-22 12:40:53 +00:00
    S3fsCurl::sseckeys.clear();
2014-07-19 19:02:55 +00:00
2020-09-11 09:37:24 +00:00
    std::ifstream ssefs(filepath);
2020-08-22 12:40:53 +00:00
    if(!ssefs.good()){
        S3FS_PRN_ERR("Could not open SSE-C keys file(%s).", filepath);
        return false;
    }
2020-09-11 09:37:24 +00:00
    std::string line;
2020-08-22 12:40:53 +00:00
    while(getline(ssefs, line)){
        S3fsCurl::PushbackSseKeys(line);
    }
    if(S3fsCurl::sseckeys.empty()){
        S3FS_PRN_ERR("There is no SSE Key in file(%s).", filepath);
        return false;
    }
    return true;
2014-07-19 19:02:55 +00:00
}
2015-10-06 14:46:14 +00:00
bool S3fsCurl::SetSseKmsid(const char* kmsid)
{
2020-08-22 12:40:53 +00:00
    if(!kmsid || '\0' == kmsid[0]){
        S3FS_PRN_ERR("SSE-KMS kms id is empty.");
        return false;
    }
    S3fsCurl::ssekmsid = kmsid;
    return true;
2015-10-06 14:46:14 +00:00
}
// [NOTE]
// Because SSE is configured through several options and environment
// variables, this function performs the final consistency check of the
// SSE settings.
2019-01-23 23:44:50 +00:00
bool S3fsCurl::FinalCheckSse()
2015-10-06 14:46:14 +00:00
{
2020-08-22 12:40:53 +00:00
    switch(S3fsCurl::ssetype){
        case sse_type_t::SSE_DISABLE:
            S3fsCurl::ssekmsid.erase();
            return true;
        case sse_type_t::SSE_S3:
            S3fsCurl::ssekmsid.erase();
            return true;
        case sse_type_t::SSE_C:
            if(S3fsCurl::sseckeys.empty()){
                S3FS_PRN_ERR("sse type is SSE-C, but there is no custom key.");
                return false;
            }
            S3fsCurl::ssekmsid.erase();
            return true;
        case sse_type_t::SSE_KMS:
            if(S3fsCurl::ssekmsid.empty()){
                S3FS_PRN_ERR("sse type is SSE-KMS, but there is no specified kms id.");
                return false;
            }
2020-10-01 09:50:49 +00:00
            if(S3fsCurl::GetSignatureType() == V2_ONLY){
2020-08-22 12:40:53 +00:00
                S3FS_PRN_ERR("sse type is SSE-KMS, but signature type is not v4. SSE-KMS require signature v4.");
                return false;
            }
            return true;
    }
    S3FS_PRN_ERR("sse type is unknown(%d).", static_cast<int>(S3fsCurl::ssetype));
    return false;
2015-10-06 14:46:14 +00:00
}
2014-08-26 17:11:10 +00:00
2019-01-23 23:44:50 +00:00
bool S3fsCurl::LoadEnvSseCKeys()
2014-08-26 17:11:10 +00:00
{
2020-08-22 12:40:53 +00:00
    char* envkeys = getenv("AWSSSECKEYS");
    if(NULL == envkeys){
        // nothing to do
        return true;
    }
    S3fsCurl::sseckeys.clear();
2020-09-11 09:37:24 +00:00
    std::istringstream fullkeys(envkeys);
    std::string onekey;
2020-08-22 12:40:53 +00:00
    while(getline(fullkeys, onekey, ':')){
        S3fsCurl::PushbackSseKeys(onekey);
    }
    if(S3fsCurl::sseckeys.empty()){
        S3FS_PRN_ERR("There is no SSE Key in environment(AWSSSECKEYS=%s).", envkeys);
        return false;
    }
2015-10-06 14:46:14 +00:00
    return true;
2014-08-26 17:11:10 +00:00
}
2014-07-19 19:02:55 +00:00
2019-01-23 23:44:50 +00:00
bool S3fsCurl::LoadEnvSseKmsid()
2015-10-06 14:46:14 +00:00
{
2020-08-22 12:40:53 +00:00
    char* envkmsid = getenv("AWSSSEKMSID");
    if(NULL == envkmsid){
        // nothing to do
        return true;
    }
    return S3fsCurl::SetSseKmsid(envkmsid);
2015-10-06 14:46:14 +00:00
}
2014-07-19 19:02:55 +00:00
//
// If md5 is empty, returns first(current) sse key.
//
2020-09-11 09:37:24 +00:00
bool S3fsCurl::GetSseKey(std::string& md5, std::string& ssekey)
2014-07-19 19:02:55 +00:00
{
2020-08-22 12:40:53 +00:00
    for(sseckeylist_t::const_iterator iter = S3fsCurl::sseckeys.begin(); iter != S3fsCurl::sseckeys.end(); ++iter){
2021-06-13 11:03:10 +00:00
        if(md5.empty() || md5 == (*iter).begin()->first){
2020-08-22 12:40:53 +00:00
            md5 = iter->begin()->first;
            ssekey = iter->begin()->second;
            return true;
        }
    }
    return false;
2014-07-19 19:02:55 +00:00
}
2021-06-13 03:50:07 +00:00
bool S3fsCurl::GetSseKeyMd5(size_t pos, std::string& md5)
2014-07-19 19:02:55 +00:00
{
2020-08-22 12:40:53 +00:00
    if(S3fsCurl::sseckeys.size() <= static_cast<size_t>(pos)){
        return false;
    }
2021-06-13 03:50:07 +00:00
    size_t cnt = 0;
2020-08-22 12:40:53 +00:00
    for(sseckeylist_t::const_iterator iter = S3fsCurl::sseckeys.begin(); iter != S3fsCurl::sseckeys.end(); ++iter, ++cnt){
        if(pos == cnt){
            md5 = iter->begin()->first;
            return true;
        }
    }
2014-07-19 19:02:55 +00:00
    return false;
}
2021-06-13 03:50:07 +00:00
size_t S3fsCurl::GetSseKeyCount()
2014-07-19 19:02:55 +00:00
{
2020-08-22 12:40:53 +00:00
    return S3fsCurl::sseckeys.size();
2014-07-19 19:02:55 +00:00
}
2013-07-05 02:28:31 +00:00
bool S3fsCurl::SetContentMd5(bool flag)
{
2020-08-22 12:40:53 +00:00
    bool old = S3fsCurl::is_content_md5;
    S3fsCurl::is_content_md5 = flag;
    return old;
2013-07-05 02:28:31 +00:00
}
2013-08-23 16:28:50 +00:00
bool S3fsCurl::SetVerbose(bool flag)
{
2020-08-22 12:40:53 +00:00
    bool old = S3fsCurl::is_verbose;
    S3fsCurl::is_verbose = flag;
    return old;
2013-08-23 16:28:50 +00:00
}
2020-05-24 07:23:27 +00:00
bool S3fsCurl::SetDumpBody(bool flag)
{
2020-08-22 12:40:53 +00:00
    bool old = S3fsCurl::is_dump_body;
    S3fsCurl::is_dump_body = flag;
    return old;
2020-05-24 07:23:27 +00:00
}
2013-07-05 02:28:31 +00:00
bool S3fsCurl::SetAccessKey(const char* AccessKeyId, const char* SecretAccessKey)
{
2020-08-22 12:40:53 +00:00
    if((!S3fsCurl::is_ibm_iam_auth && (!AccessKeyId || '\0' == AccessKeyId[0])) || !SecretAccessKey || '\0' == SecretAccessKey[0]){
        return false;
    }
    AWSAccessKeyId = AccessKeyId;
    AWSSecretAccessKey = SecretAccessKey;
    return true;
2013-07-05 02:28:31 +00:00
}
2011-09-01 19:24:12 +00:00
2019-04-14 16:19:34 +00:00
bool S3fsCurl::SetAccessKeyWithSessionToken(const char* AccessKeyId, const char* SecretAccessKey, const char* SessionToken)
{
2020-08-22 12:40:53 +00:00
    bool access_key_is_empty = !AccessKeyId || '\0' == AccessKeyId[0];
    bool secret_access_key_is_empty = !SecretAccessKey || '\0' == SecretAccessKey[0];
    bool session_token_is_empty = !SessionToken || '\0' == SessionToken[0];
    if((!S3fsCurl::is_ibm_iam_auth && access_key_is_empty) || secret_access_key_is_empty || session_token_is_empty){
        return false;
    }
    AWSAccessKeyId = AccessKeyId;
    AWSSecretAccessKey = SecretAccessKey;
    AWSAccessToken = SessionToken;
    S3fsCurl::is_use_session_token = true;
    return true;
2019-04-14 16:19:34 +00:00
}
2013-07-05 02:28:31 +00:00
long S3fsCurl::SetSslVerifyHostname(long value)
{
2020-08-22 12:40:53 +00:00
    if(0 != value && 1 != value){
        return -1;
    }
    long old = S3fsCurl::ssl_verify_hostname;
    S3fsCurl::ssl_verify_hostname = value;
    return old;
2013-07-05 02:28:31 +00:00
}
2011-09-01 19:24:12 +00:00
2017-11-23 08:46:24 +00:00
bool S3fsCurl::SetIsIBMIAMAuth(bool flag)
{
2020-08-22 12:40:53 +00:00
    bool old = S3fsCurl::is_ibm_iam_auth;
    S3fsCurl::is_ibm_iam_auth = flag;
    return old;
2017-11-23 08:46:24 +00:00
}
2017-11-05 19:24:02 +00:00
bool S3fsCurl::SetIsECS(bool flag)
{
2020-08-22 12:40:53 +00:00
    bool old = S3fsCurl::is_ecs;
    S3fsCurl::is_ecs = flag;
    return old;
2017-11-05 19:24:02 +00:00
}
2020-09-11 09:37:24 +00:00
std::string S3fsCurl::SetIAMRole(const char* role)
2013-10-06 13:45:32 +00:00
{
2020-09-11 09:37:24 +00:00
    std::string old = S3fsCurl::IAM_role;
2020-08-22 12:40:53 +00:00
    S3fsCurl::IAM_role = role ? role : "";
    return old;
2013-10-06 13:45:32 +00:00
}
2017-11-23 08:46:24 +00:00
size_t S3fsCurl::SetIAMFieldCount(size_t field_count)
{
2020-08-22 12:40:53 +00:00
    size_t old = S3fsCurl::IAM_field_count;
    S3fsCurl::IAM_field_count = field_count;
    return old;
2017-11-23 08:46:24 +00:00
}
2020-09-11 09:37:24 +00:00
std::string S3fsCurl::SetIAMCredentialsURL(const char* url)
2017-11-23 08:46:24 +00:00
{
2020-09-11 09:37:24 +00:00
    std::string old = S3fsCurl::IAM_cred_url;
2020-08-22 12:40:53 +00:00
    S3fsCurl::IAM_cred_url = url ? url : "";
    return old;
2017-11-23 08:46:24 +00:00
}
2020-09-11 09:37:24 +00:00
std::string S3fsCurl::SetIAMTokenField(const char* token_field)
2017-11-23 08:46:24 +00:00
{
2020-09-11 09:37:24 +00:00
    std::string old = S3fsCurl::IAM_token_field;
2020-08-22 12:40:53 +00:00
    S3fsCurl::IAM_token_field = token_field ? token_field : "";
    return old;
2017-11-23 08:46:24 +00:00
}
2020-09-11 09:37:24 +00:00
std::string S3fsCurl::SetIAMExpiryField(const char* expiry_field)
2017-11-23 08:46:24 +00:00
{
2020-09-11 09:37:24 +00:00
    std::string old = S3fsCurl::IAM_expiry_field;
2020-08-22 12:40:53 +00:00
    S3fsCurl::IAM_expiry_field = expiry_field ? expiry_field : "";
    return old;
2017-11-23 08:46:24 +00:00
}
2020-10-15 17:18:19 +00:00
bool S3fsCurl::SetIMDSVersion(int version)
{
2020-10-30 16:59:55 +00:00
    S3fsCurl::IAM_api_version = version;
    return true;
2020-10-15 17:18:19 +00:00
}
2014-03-30 07:53:41 +00:00
bool S3fsCurl::SetMultipartSize(off_t size)
{
2020-08-22 12:40:53 +00:00
    size = size * 1024 * 1024;
    if(size < MIN_MULTIPART_SIZE){
        return false;
    }
    S3fsCurl::multipart_size = size;
    return true;
2014-03-30 07:53:41 +00:00
}
2021-02-08 11:32:12 +00:00
bool S3fsCurl::SetMultipartCopySize(off_t size)
{
    size = size * 1024 * 1024;
    if(size < MIN_MULTIPART_SIZE){
        return false;
    }
    S3fsCurl::multipart_copy_size = size;
    return true;
}
2013-07-23 16:01:48 +00:00
int S3fsCurl::SetMaxParallelCount(int value)
2013-07-10 06:24:06 +00:00
{
2020-08-22 12:40:53 +00:00
    int old = S3fsCurl::max_parallel_cnt;
    S3fsCurl::max_parallel_cnt = value;
    return old;
2013-07-10 06:24:06 +00:00
}
2019-01-19 01:16:56 +00:00
int S3fsCurl::SetMaxMultiRequest(int max)
{
2020-08-22 12:40:53 +00:00
    int old = S3fsCurl::max_multireq;
    S3fsCurl::max_multireq = max;
    return old;
2019-01-19 01:16:56 +00:00
}
2013-07-10 06:24:06 +00:00
bool S3fsCurl::UploadMultipartPostCallback(S3fsCurl* s3fscurl)
{
2020-08-22 12:40:53 +00:00
    if(!s3fscurl){
        return false;
    }
2013-07-10 06:24:06 +00:00
2020-08-22 12:40:53 +00:00
    return s3fscurl->UploadMultipartPostComplete();
2013-07-10 06:24:06 +00:00
}
2019-09-26 02:30:58 +00:00
bool S3fsCurl::MixMultipartPostCallback(S3fsCurl* s3fscurl)
{
2020-08-22 12:40:53 +00:00
    if(!s3fscurl){
        return false;
    }
2019-09-26 02:30:58 +00:00
2020-08-22 12:40:53 +00:00
    return s3fscurl->MixMultipartPostComplete();
2019-09-26 02:30:58 +00:00
}
2013-07-10 06:24:06 +00:00
S3fsCurl* S3fsCurl::UploadMultipartPostRetryCallback(S3fsCurl* s3fscurl)
{
2020-08-22 12:40:53 +00:00
    if(!s3fscurl){
        return NULL;
    }
    // parse and get part_num, upload_id.
2020-09-11 09:37:24 +00:00
    std::string upload_id;
    std::string part_num_str;
2021-06-13 03:50:07 +00:00
    int part_num;
2020-08-22 12:40:53 +00:00
    off_t tmp_part_num = 0;
    if(!get_keyword_value(s3fscurl->url, "uploadId", upload_id)){
        return NULL;
    }
    if(!get_keyword_value(s3fscurl->url, "partNumber", part_num_str)){
        return NULL;
    }
2020-10-02 01:23:56 +00:00
    if(!s3fs_strtoofft(&tmp_part_num, part_num_str.c_str(), /*base=*/ 10)){
2020-08-22 12:40:53 +00:00
        return NULL;
    }
2021-06-13 03:50:07 +00:00
    part_num = static_cast<int>(tmp_part_num);
2020-08-22 12:40:53 +00:00
    if(s3fscurl->retry_count >= S3fsCurl::retries){
        S3FS_PRN_ERR("Over retry count(%d) limit(%s:%d).", s3fscurl->retry_count, s3fscurl->path.c_str(), part_num);
        return NULL;
    }
    // duplicate request
    S3fsCurl* newcurl = new S3fsCurl(s3fscurl->IsUseAhbe());
2020-11-20 21:56:05 +00:00
    newcurl->partdata.petag = s3fscurl->partdata.petag;
2020-08-22 12:40:53 +00:00
    newcurl->partdata.fd = s3fscurl->partdata.fd;
    newcurl->partdata.startpos = s3fscurl->b_partdata_startpos;
    newcurl->partdata.size = s3fscurl->b_partdata_size;
    newcurl->b_partdata_startpos = s3fscurl->b_partdata_startpos;
    newcurl->b_partdata_size = s3fscurl->b_partdata_size;
    newcurl->retry_count = s3fscurl->retry_count + 1;
    newcurl->op = s3fscurl->op;
    newcurl->type = s3fscurl->type;
    // setup new curl object
    if(0 != newcurl->UploadMultipartPostSetup(s3fscurl->path.c_str(), part_num, upload_id)){
        S3FS_PRN_ERR("Could not duplicate curl object(%s:%d).", s3fscurl->path.c_str(), part_num);
        delete newcurl;
        return NULL;
    }
    return newcurl;
2013-07-10 06:24:06 +00:00
}
2019-01-29 07:39:11 +00:00
S3fsCurl* S3fsCurl::CopyMultipartPostRetryCallback(S3fsCurl* s3fscurl)
{
2020-08-22 12:40:53 +00:00
    if(!s3fscurl){
        return NULL;
    }
    // parse and get part_num, upload_id.
2020-09-11 09:37:24 +00:00
    std::string upload_id;
    std::string part_num_str;
2021-06-13 03:50:07 +00:00
    int part_num;
2020-08-22 12:40:53 +00:00
    off_t tmp_part_num = 0;
    if(!get_keyword_value(s3fscurl->url, "uploadId", upload_id)){
        return NULL;
    }
    if(!get_keyword_value(s3fscurl->url, "partNumber", part_num_str)){
        return NULL;
    }
2020-10-02 01:23:56 +00:00
    if(!s3fs_strtoofft(&tmp_part_num, part_num_str.c_str(), /*base=*/ 10)){
2020-08-22 12:40:53 +00:00
        return NULL;
    }
2021-06-13 03:50:07 +00:00
    part_num = static_cast<int>(tmp_part_num);
2020-08-22 12:40:53 +00:00
    if(s3fscurl->retry_count >= S3fsCurl::retries){
        S3FS_PRN_ERR("Over retry count(%d) limit(%s:%d).", s3fscurl->retry_count, s3fscurl->path.c_str(), part_num);
        return NULL;
    }
    // duplicate request
    S3fsCurl* newcurl = new S3fsCurl(s3fscurl->IsUseAhbe());
2020-11-20 21:56:05 +00:00
    newcurl->partdata.petag = s3fscurl->partdata.petag;
2020-08-22 12:40:53 +00:00
    newcurl->b_from = s3fscurl->b_from;
    newcurl->b_meta = s3fscurl->b_meta;
    newcurl->retry_count = s3fscurl->retry_count + 1;
    newcurl->op = s3fscurl->op;
    newcurl->type = s3fscurl->type;
    // setup new curl object
    if(0 != newcurl->CopyMultipartPostSetup(s3fscurl->b_from.c_str(), s3fscurl->path.c_str(), part_num, upload_id, s3fscurl->b_meta)){
        S3FS_PRN_ERR("Could not duplicate curl object(%s:%d).", s3fscurl->path.c_str(), part_num);
        delete newcurl;
        return NULL;
    }
    return newcurl;
2019-01-29 07:39:11 +00:00
}
2019-09-26 02:30:58 +00:00
S3fsCurl* S3fsCurl::MixMultipartPostRetryCallback(S3fsCurl* s3fscurl)
{
2020-08-22 12:40:53 +00:00
    if(!s3fscurl){
        return NULL;
    }
    S3fsCurl* pcurl;
    if(-1 == s3fscurl->partdata.fd){
        pcurl = S3fsCurl::CopyMultipartPostRetryCallback(s3fscurl);
    }else{
        pcurl = S3fsCurl::UploadMultipartPostRetryCallback(s3fscurl);
    }
    return pcurl;
2019-09-26 02:30:58 +00:00
}
2021-04-25 04:18:11 +00:00
int S3fsCurl::MapPutErrorResponse(int result)
{
    if(result != 0){
        return result;
    }
    // A PUT request can return a 200 status code even though an error
    // occurred, so the response body must be checked as well.
    //
    // example error body:
    //     <?xml version="1.0" encoding="UTF-8"?>
    //     <Error>
    //       <Code>AccessDenied</Code>
    //       <Message>Access Denied</Message>
    //       <RequestId>E4CA6F6767D6685C</RequestId>
    //       <HostId>BHzLOATeDuvN8Es1wI8IcERq4kl4dc2A9tOB8Yqr39Ys6fl7N4EJ8sjGiVvu6wLP</HostId>
    //     </Error>
    //
    const char* pstrbody = bodydata.str();
    if(!pstrbody || NULL != strcasestr(pstrbody, "<Error>")){
        S3FS_PRN_ERR("Put request get 200 status response, but it included error body(or NULL). The request failed during copying the object in S3.");
        S3FS_PRN_DBG("Put request Response Body : %s", (pstrbody ? pstrbody : "(null)"));
        // TODO: parse more specific error from <Code>
        result = -EIO;
    }
    return result;
}
2014-08-26 17:11:10 +00:00
int S3fsCurl::ParallelMultipartUploadRequest(const char* tpath, headers_t& meta, int fd)
2013-07-10 06:24:06 +00:00
{
2020-08-22 12:40:53 +00:00
    int result;
2020-09-11 09:37:24 +00:00
    std::string upload_id;
2020-08-22 12:40:53 +00:00
    struct stat st;
    int fd2;
    etaglist_t list;
    off_t remaining_bytes;
    S3fsCurl s3fscurl(true);
2013-07-10 06:24:06 +00:00
2020-08-22 12:40:53 +00:00
    S3FS_PRN_INFO3("[tpath=%s][fd=%d]", SAFESTRPTR(tpath), fd);
2018-11-15 00:48:57 +00:00
2020-08-22 12:40:53 +00:00
    // duplicate fd
    if(-1 == (fd2 = dup(fd)) || 0 != lseek(fd2, 0, SEEK_SET)){
        int saved_errno = errno;    // logging and close() may overwrite errno
        S3FS_PRN_ERR("Could not duplicate file descriptor(errno=%d)", saved_errno);
        if(-1 != fd2){
            close(fd2);
        }
        return -saved_errno;
    }
    if(-1 == fstat(fd2, &st)){
        int saved_errno = errno;    // logging and close() may overwrite errno
        S3FS_PRN_ERR("Invalid file descriptor(errno=%d)", saved_errno);
        close(fd2);
        return -saved_errno;
    }
2018-11-15 00:48:57 +00:00
2020-08-22 12:40:53 +00:00
    if(0 != (result = s3fscurl.PreMultipartPostRequest(tpath, meta, upload_id, false))){
        close(fd2);
        return result;
2013-07-10 06:24:06 +00:00
    }
2020-08-22 12:40:53 +00:00
    s3fscurl.DestroyCurlHandle();
2013-07-10 06:24:06 +00:00
2020-08-22 12:40:53 +00:00
    // Initialize S3fsMultiCurl
    S3fsMultiCurl curlmulti(GetMaxParallelCount());
    curlmulti.SetSuccessCallback(S3fsCurl::UploadMultipartPostCallback);
    curlmulti.SetRetryCallback(S3fsCurl::UploadMultipartPostRetryCallback);
    // cycle through the open fd, taking one multipart_size chunk (10MB by default) at a time
    for(remaining_bytes = st.st_size; 0 < remaining_bytes; ){
        off_t chunk = remaining_bytes > S3fsCurl::multipart_size ? S3fsCurl::multipart_size : remaining_bytes;
        // s3fscurl sub object
        S3fsCurl* s3fscurl_para = new S3fsCurl(true);
        s3fscurl_para->partdata.fd = fd2;
        s3fscurl_para->partdata.startpos = st.st_size - remaining_bytes;
        s3fscurl_para->partdata.size = chunk;
        s3fscurl_para->b_partdata_startpos = s3fscurl_para->partdata.startpos;
        s3fscurl_para->b_partdata_size = s3fscurl_para->partdata.size;
2021-08-15 14:34:21 +00:00
        s3fscurl_para->partdata.add_etag_list(list);
2020-08-22 12:40:53 +00:00
        // initiate upload part for parallel
2021-08-15 14:34:21 +00:00
        if(0 != (result = s3fscurl_para->UploadMultipartPostSetup(tpath, s3fscurl_para->partdata.get_part_number(), upload_id))){
2020-08-22 12:40:53 +00:00
            S3FS_PRN_ERR("failed uploading part setup(%d)", result);
            close(fd2);
            delete s3fscurl_para;
            return result;
        }
        // set into parallel object
        if(!curlmulti.SetS3fsCurlObject(s3fscurl_para)){
            S3FS_PRN_ERR("Could not make curl object into multi curl(%s).", tpath);
            close(fd2);
            delete s3fscurl_para;
2021-01-18 09:50:49 +00:00
            return -EIO;
2020-08-22 12:40:53 +00:00
        }
        remaining_bytes -= chunk;
2013-07-10 06:24:06 +00:00
    }
2020-08-22 12:40:53 +00:00
    // Multi request
    if(0 != (result = curlmulti.Request())){
        S3FS_PRN_ERR("error occurred in multi request(errno=%d).", result);

        S3fsCurl s3fscurl_abort(true);
        int result2 = s3fscurl_abort.AbortMultipartUpload(tpath, upload_id);
        s3fscurl_abort.DestroyCurlHandle();
        if(result2 != 0){
            S3FS_PRN_ERR("error aborting multipart upload(errno=%d).", result2);
        }
        close(fd2);     // release the duplicated descriptor on the error path too
        return result;
    }
    close(fd2);

    if(0 != (result = s3fscurl.CompleteMultipartPostRequest(tpath, upload_id, list))){
        return result;
    }
    return 0;
}
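
The chunking loop in the function above can be looked at in isolation. The following is a minimal sketch, not the s3fs implementation: `PartRange` and `SplitIntoParts` are hypothetical names introduced only for illustration, and the part size is passed in as a parameter instead of being read from `S3fsCurl::multipart_size`.

```cpp
#include <cassert>
#include <sys/types.h>
#include <vector>

// Hypothetical description of one upload part (not the s3fs partdata type).
struct PartRange {
    off_t startpos;     // byte offset of the part within the file
    off_t size;         // number of bytes in the part
    int   part_number;  // 1-based part number
};

// Split a file of file_size bytes into consecutive parts of at most
// part_size bytes, using the same "remaining bytes" loop shape as above.
std::vector<PartRange> SplitIntoParts(off_t file_size, off_t part_size)
{
    assert(0 < part_size);
    std::vector<PartRange> parts;
    for(off_t remaining = file_size; 0 < remaining; ){
        off_t chunk = (part_size < remaining) ? part_size : remaining;
        PartRange part = { file_size - remaining, chunk, static_cast<int>(parts.size()) + 1 };
        parts.push_back(part);
        remaining -= chunk;
    }
    return parts;
}
```

Only the last part can be shorter than `part_size`; every other part starts at a multiple of it.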

int S3fsCurl::ParallelMixMultipartUploadRequest(const char* tpath, headers_t& meta, int fd, const fdpage_list_t& mixuppages)
{
    int            result;
    std::string    upload_id;
    struct stat    st;
    int            fd2;
    etaglist_t     list;
    S3fsCurl       s3fscurl(true);

    S3FS_PRN_INFO3("[tpath=%s][fd=%d]", SAFESTRPTR(tpath), fd);

    // duplicate fd
    if(-1 == (fd2 = dup(fd)) || 0 != lseek(fd2, 0, SEEK_SET)){
        int saved_errno = errno;    // logging or close() may overwrite errno
        S3FS_PRN_ERR("Could not duplicate file descriptor(errno=%d)", saved_errno);
        if(-1 != fd2){
            close(fd2);
        }
        return -saved_errno;
    }
    if(-1 == fstat(fd2, &st)){
        int saved_errno = errno;    // logging or close() may overwrite errno
        S3FS_PRN_ERR("Invalid file descriptor(errno=%d)", saved_errno);
        close(fd2);
        return -saved_errno;
    }

    if(0 != (result = s3fscurl.PreMultipartPostRequest(tpath, meta, upload_id, true))){
        close(fd2);
        return result;
    }
    s3fscurl.DestroyCurlHandle();

    // for copy multipart
    std::string srcresource;
    std::string srcurl;
    MakeUrlResource(get_realpath(tpath).c_str(), srcresource, srcurl);
    meta["Content-Type"]      = S3fsCurl::LookupMimeType(std::string(tpath));
    meta["x-amz-copy-source"] = srcresource;

    // Initialize S3fsMultiCurl
    S3fsMultiCurl curlmulti(GetMaxParallelCount());
    curlmulti.SetSuccessCallback(S3fsCurl::MixMultipartPostCallback);
    curlmulti.SetRetryCallback(S3fsCurl::MixMultipartPostRetryCallback);

    for(fdpage_list_t::const_iterator iter = mixuppages.begin(); iter != mixuppages.end(); ++iter){
        if(iter->modified){
            // Multipart upload
            S3fsCurl* s3fscurl_para            = new S3fsCurl(true);
            s3fscurl_para->partdata.fd         = fd2;
            s3fscurl_para->partdata.startpos   = iter->offset;
            s3fscurl_para->partdata.size       = iter->bytes;
            s3fscurl_para->b_partdata_startpos = s3fscurl_para->partdata.startpos;
            s3fscurl_para->b_partdata_size     = s3fscurl_para->partdata.size;
            s3fscurl_para->partdata.add_etag_list(list);

            S3FS_PRN_INFO3("Upload Part [tpath=%s][start=%lld][size=%lld][part=%d]", SAFESTRPTR(tpath), static_cast<long long>(iter->offset), static_cast<long long>(iter->bytes), s3fscurl_para->partdata.get_part_number());

            // initiate upload part for parallel
            if(0 != (result = s3fscurl_para->UploadMultipartPostSetup(tpath, s3fscurl_para->partdata.get_part_number(), upload_id))){
                S3FS_PRN_ERR("failed uploading part setup(%d)", result);
                close(fd2);
                delete s3fscurl_para;
                return result;
            }

            // set into parallel object
            if(!curlmulti.SetS3fsCurlObject(s3fscurl_para)){
                S3FS_PRN_ERR("Could not make curl object into multi curl(%s).", tpath);
                close(fd2);
                delete s3fscurl_para;
                return -EIO;
            }
        }else{
            // Multipart copy
            for(off_t i = 0, bytes = 0; i < iter->bytes; i += bytes){
                S3fsCurl* s3fscurl_para = new S3fsCurl(true);

                bytes = std::min(static_cast<off_t>(GetMultipartCopySize()), iter->bytes - i);
                /* every part should be larger than MIN_MULTIPART_SIZE and smaller than FIVE_GB */
                off_t remain_bytes = iter->bytes - i - bytes;
                if((MIN_MULTIPART_SIZE > remain_bytes) && (0 < remain_bytes)){
                    if(FIVE_GB < (bytes + remain_bytes)){
                        bytes = (bytes + remain_bytes) / 2;
                    }else{
                        bytes += remain_bytes;
                    }
                }

                std::ostringstream strrange;
                strrange << "bytes=" << (iter->offset + i) << "-" << (iter->offset + i + bytes - 1);
                meta["x-amz-copy-source-range"] = strrange.str();

                s3fscurl_para->b_from = SAFESTRPTR(tpath);
                s3fscurl_para->b_meta = meta;
                s3fscurl_para->partdata.add_etag_list(list);

                S3FS_PRN_INFO3("Copy Part [tpath=%s][start=%lld][size=%lld][part=%d]", SAFESTRPTR(tpath), static_cast<long long>(iter->offset + i), static_cast<long long>(bytes), s3fscurl_para->partdata.get_part_number());

                // initiate upload part for parallel
                if(0 != (result = s3fscurl_para->CopyMultipartPostSetup(tpath, tpath, s3fscurl_para->partdata.get_part_number(), upload_id, meta))){
                    S3FS_PRN_ERR("failed uploading part setup(%d)", result);
                    close(fd2);
                    delete s3fscurl_para;
                    return result;
                }

                // set into parallel object
                if(!curlmulti.SetS3fsCurlObject(s3fscurl_para)){
                    S3FS_PRN_ERR("Could not make curl object into multi curl(%s).", tpath);
                    close(fd2);
                    delete s3fscurl_para;
                    return -EIO;
                }
            }
        }
    }

    // Multi request
    if(0 != (result = curlmulti.Request())){
        S3FS_PRN_ERR("error occurred in multi request(errno=%d).", result);

        S3fsCurl s3fscurl_abort(true);
        int result2 = s3fscurl_abort.AbortMultipartUpload(tpath, upload_id);
        s3fscurl_abort.DestroyCurlHandle();
        if(result2 != 0){
            S3FS_PRN_ERR("error aborting multipart upload(errno=%d).", result2);
        }
        close(fd2);
        return result;
    }
    close(fd2);

    if(0 != (result = s3fscurl.CompleteMultipartPostRequest(tpath, upload_id, list))){
        return result;
    }
    return 0;
}
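
The copy branch above sizes each copied part so that a trailing remainder smaller than the minimum part size is folded into the preceding part (or the two are split evenly when the sum would exceed the maximum). The sketch below reproduces only that arithmetic and the range string. `MakeCopyRanges` and its parameters are hypothetical names; the limits are passed in as arguments rather than taken from `MIN_MULTIPART_SIZE`, `FIVE_GB`, or `GetMultipartCopySize()`.

```cpp
#include <algorithm>
#include <cassert>
#include <sstream>
#include <string>
#include <sys/types.h>
#include <vector>

// Build "bytes=first-last" range strings covering total bytes that start
// at offset. copy_size is the preferred part size; min_part and max_part
// stand in for the minimum and maximum part sizes used by the real code.
std::vector<std::string> MakeCopyRanges(off_t offset, off_t total, off_t copy_size, off_t min_part, off_t max_part)
{
    assert(0 < copy_size);
    std::vector<std::string> ranges;
    for(off_t i = 0, bytes = 0; i < total; i += bytes){
        bytes = std::min(copy_size, total - i);
        off_t remain = total - i - bytes;
        if(0 < remain && remain < min_part){
            // the tail would be too small to be a part of its own
            if(max_part < (bytes + remain)){
                bytes = (bytes + remain) / 2;   // split the rest into two halves
            }else{
                bytes += remain;                // absorb the tail into this part
            }
        }
        std::ostringstream strrange;
        strrange << "bytes=" << (offset + i) << "-" << (offset + i + bytes - 1);
        ranges.push_back(strrange.str());
    }
    return ranges;
}
```

Note that the last byte position in each range is inclusive, which is why the end of the range is `offset + i + bytes - 1`.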
Changed code for performance (part 3)
* Summary
This revision includes a big change to the handling of temporary files and
local cache files.
With this change, s3fs performs well when it opens, closes, syncs, and reads
objects.
* Detail
1) About temporary file (local file)
s3fs uses a temporary file on the local file system when it downloads,
uploads, opens, or seeks an object on S3.
As of this revision, s3fs calls ftruncate() when it creates the temporary
file.
In this way s3fs can set the file to the exact object size without
downloading any data.
(Notice: ftruncate() is specified for XSI-compliant systems, so you may have
problems on non-XSI-compliant systems.)
With this change, s3fs can download part of an object by sending a request
with the "Range" HTTP header, so downloads proceed one block unit at a time.
The default block (part) size is 50MB, which is the default parallel request
count (5) multiplied by the default multipart upload size (10MB).
If you need to change this block size, use the new option "fd_page_size".
This option accepts any value from 1MB (1024 * 1024 bytes) upward.
Note that fdcache.cpp (and fdcache.h) changed a lot as a result.
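
As a rough illustration of the two ideas in this section, the sketch below sizes an empty temporary file with ftruncate() and formats a "Range" header for one block. It is not the fdcache.cpp code: `MakeSizedTempFile`, `MakeRangeHeader`, and the `/tmp` path template are assumptions made for the example.

```cpp
#include <cassert>
#include <cstdlib>
#include <sstream>
#include <string>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

// Create an unnamed temporary file and extend it to object_size bytes
// without writing any data. Returns the file descriptor, or -1 on error.
int MakeSizedTempFile(off_t object_size)
{
    char path[] = "/tmp/sparse_sketch.XXXXXX";  // hypothetical location
    int  fd     = mkstemp(path);
    if(-1 == fd){
        return -1;
    }
    unlink(path);                               // keep only the open descriptor
    if(-1 == ftruncate(fd, object_size)){
        close(fd);
        return -1;
    }
    return fd;
}

// Format the HTTP header that requests size bytes starting at start.
std::string MakeRangeHeader(off_t start, off_t size)
{
    assert(0 < size);
    std::ostringstream range;
    range << "Range: bytes=" << start << "-" << (start + size - 1);
    return range.str();
}
```

After `MakeSizedTempFile()` returns, fstat() reports the full object size even though no block has been downloaded yet; each block can then be fetched with its own range header and written at the matching offset.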
2) About local cache
Local cache files in the directory specified by the "use_cache" option do
not always contain all of the object data.
This is because s3fs uses ftruncate() and reads (writes) the temporary file
one block unit at a time.
s3fs tracks whether each block unit is a "downloaded area" or not.
To hold this status, s3fs creates a new temporary file (a status file) in the
cache directory specified by the "use_cache" option. The status files are in
a directory named "<use_cache directory>/.<bucket_name>/".
When s3fs opens a status file, it locks the file for exclusive control by
calling flock(). Take care: because of this, the status files cannot be
placed on a network drive (such as NFS).
This revision also changes the file open mode: s3fs always opens a local
cache file and each status file in writable mode.
Finally, this revision adds the new option "del_cache", which makes s3fs
delete all local cache files when it starts and exits.
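
The exclusive locking of a status file described above can be sketched with plain POSIX/BSD calls. This is only an illustration under assumptions: `OpenLockedStatusFile` and `CloseLockedStatusFile` are hypothetical helpers, and the real status file layout and path handling in s3fs are not shown.

```cpp
#include <cassert>
#include <fcntl.h>
#include <sys/file.h>
#include <unistd.h>

// Open (creating if needed) a status file in writable mode and take an
// exclusive lock on it. Returns the locked descriptor, or -1 on error.
int OpenLockedStatusFile(const char* path)
{
    assert(path);
    int fd = open(path, O_CREAT | O_RDWR, 0600);
    if(-1 == fd){
        return -1;
    }
    if(-1 == flock(fd, LOCK_EX)){   // blocks until the lock is available
        close(fd);
        return -1;
    }
    return fd;
}

// Release the lock and close the descriptor.
void CloseLockedStatusFile(int fd)
{
    flock(fd, LOCK_UN);
    close(fd);
}
```

While one descriptor holds the lock, a second open() of the same file cannot take `LOCK_EX` until the first one is released. flock() behavior on network file systems varies by platform, which is consistent with the warning about NFS above.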
3) Uploading
When s3fs writes data to a file descriptor through a FUSE request, older s3fs
revisions downloaded the whole object first. The new revision downloads only
the small partial area (some block units) that includes the area being
written.
When s3fs closes or flushes the file descriptor, it downloads the remaining
areas that have not yet been downloaded from the server, and then uploads all
of the data.
Revision r456 already added the parallel upload function, so this revision
together with r456 and r457 is a very big change for performance.
4) Downloading
Because of the changes to temporary files and local cache files, when s3fs
downloads an object it downloads only the required range (some block units).
s3fs downloads these units with parallel GET requests, in the same way as
uploading. (The maximum parallel request count and the size of each download
are specified by the same parameters as for uploading.)
In the new revision, when s3fs opens a file it returns the file descriptor
immediately, because it only opens (creates) the file descriptor without
downloading any data. When s3fs reads data, it downloads only the block
units that include the requested area.
This is good for performance.
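
To make the batching concrete, the sketch below plans how a requested range would be cut into chunks and grouped into batches of parallel GET requests. It is a simplified model, not the s3fs code path: `RangeRequest` and `PlanParallelDownload` are hypothetical, and the chunk size and parallel count are plain parameters.

```cpp
#include <cassert>
#include <sys/types.h>
#include <vector>

// Hypothetical description of one ranged GET request.
struct RangeRequest {
    off_t start;    // first byte to download
    off_t size;     // number of bytes to download
};

// Cut [start, start + size) into chunks of at most chunk_size bytes and
// group them into batches of at most max_parallel requests each.
std::vector< std::vector<RangeRequest> > PlanParallelDownload(off_t start, off_t size, off_t chunk_size, int max_parallel)
{
    assert(0 < chunk_size && 0 < max_parallel);
    std::vector< std::vector<RangeRequest> > batches;
    off_t remaining = size;
    while(0 < remaining){
        std::vector<RangeRequest> batch;
        for(int cnt = 0; cnt < max_parallel && 0 < remaining; ++cnt){
            off_t chunk = (chunk_size < remaining) ? chunk_size : remaining;
            RangeRequest req = { start + size - remaining, chunk };
            batch.push_back(req);
            remaining -= chunk;
        }
        batches.push_back(batch);
    }
    return batches;
}
```

With the defaults mentioned in this message (10MB chunks, 5 parallel requests), one full batch covers 50MB, which matches the default block size given in section 1.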
5) Changed option name
The option "parallel_upload", which was added in r456, is renamed to
"parallel_count", because the value is used not only when uploading objects
but also when downloading them. (For a while, you can still use the old
option name "parallel_upload" for compatibility.)
git-svn-id: http://s3fs.googlecode.com/svn/trunk@458 df820570-a93a-0410-bd06-b72b767a4274
2013-07-23 16:01:48 +00:00

S3fsCurl* S3fsCurl::ParallelGetObjectRetryCallback(S3fsCurl* s3fscurl)
{
    int result;

    if(!s3fscurl){
        return NULL;
    }
    if(s3fscurl->retry_count >= S3fsCurl::retries){
        S3FS_PRN_ERR("Over retry count(%d) limit(%s).", s3fscurl->retry_count, s3fscurl->path.c_str());
        return NULL;
    }

    // duplicate request(setup new curl object)
    S3fsCurl* newcurl = new S3fsCurl(s3fscurl->IsUseAhbe());

    if(0 != (result = newcurl->PreGetObjectRequest(s3fscurl->path.c_str(), s3fscurl->partdata.fd, s3fscurl->partdata.startpos, s3fscurl->partdata.size, s3fscurl->b_ssetype, s3fscurl->b_ssevalue))){
        S3FS_PRN_ERR("failed downloading part setup(%d)", result);
        delete newcurl;
        return NULL;
    }
    newcurl->retry_count = s3fscurl->retry_count + 1;

    return newcurl;
}
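
The retry callback above follows a simple pattern: refuse when the retry limit is reached, otherwise build a fresh request for the same range and carry the retry count forward. A minimal sketch of that pattern, detached from libcurl and from the `S3fsCurl` class, might look like this. `RangeGet` and `MakeRetryRequest` are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for one ranged download request.
struct RangeGet {
    std::string path;         // object path
    long long   start;        // first byte of the range
    long long   size;         // number of bytes in the range
    int         retry_count;  // how many times this range has been retried
};

// Return a newly allocated request that repeats the failed one, or NULL
// when there is nothing to retry or the retry limit has been reached.
// The caller owns the returned object.
RangeGet* MakeRetryRequest(const RangeGet* failed, int max_retries)
{
    assert(0 <= max_retries);
    if(!failed || max_retries <= failed->retry_count){
        return NULL;
    }
    RangeGet* retry    = new RangeGet(*failed);   // same path and byte range
    retry->retry_count = failed->retry_count + 1;
    return retry;
}
```

Returning NULL is what signals "give up" to the caller, so the limit check has to come before any allocation.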

int S3fsCurl::ParallelGetObjectRequest(const char* tpath, int fd, off_t start, off_t size)
{
    S3FS_PRN_INFO3("[tpath=%s][fd=%d]", SAFESTRPTR(tpath), fd);

    sse_type_t  ssetype = sse_type_t::SSE_DISABLE;
    std::string ssevalue;
    if(!get_object_sse_type(tpath, ssetype, ssevalue)){
        S3FS_PRN_WARN("Failed to get SSE type for file(%s).", SAFESTRPTR(tpath));
    }
    int   result = 0;
    off_t remaining_bytes;

    // cycle through the requested range, pulling off multipart_size chunks (10MB by default) at a time
    for(remaining_bytes = size; 0 < remaining_bytes; ){
        S3fsMultiCurl curlmulti(GetMaxParallelCount());
        int           para_cnt;
        off_t         chunk;

        // Initialize S3fsMultiCurl
        //curlmulti.SetSuccessCallback(NULL);   // not need to set success callback
        curlmulti.SetRetryCallback(S3fsCurl::ParallelGetObjectRetryCallback);

        // Loop for setup parallel download(range GET) request.
        for(para_cnt = 0; para_cnt < S3fsCurl::max_parallel_cnt && 0 < remaining_bytes; para_cnt++, remaining_bytes -= chunk){
            // chunk size
            chunk = remaining_bytes > S3fsCurl::multipart_size ? S3fsCurl::multipart_size : remaining_bytes;

            // s3fscurl sub object
            S3fsCurl* s3fscurl_para = new S3fsCurl();
            if(0 != (result = s3fscurl_para->PreGetObjectRequest(tpath, fd, (start + size - remaining_bytes), chunk, ssetype, ssevalue))){
                S3FS_PRN_ERR("failed downloading part setup(%d)", result);
                delete s3fscurl_para;
                return result;
            }

            // set into parallel object
            if(!curlmulti.SetS3fsCurlObject(s3fscurl_para)){
                S3FS_PRN_ERR("Could not make curl object into multi curl(%s).", tpath);
                delete s3fscurl_para;
                return -EIO;
            }
        }

Changes codes for performance(part 3)
* Summay
This revision includes big change about temporary file and local cache file.
By this big change, s3fs works with good performance when s3fs opens/
closes/syncs/reads object.
I made a big change about the handling about temporary file and local cache
file to do this implementation.
* Detail
1) About temporary file(local file)
s3fs uses a temporary file on local file system when s3fs does download/
upload/open/seek object on S3.
After this revision, s3fs calls ftruncate() function when s3fs makes the
temporary file.
In this way s3fs can set a file size of precisely length without downloading.
(Notice - ftruncate function is for XSI-compliant systems, so that possibly
you have a problem on non-XSI-compliant systems.)
By this change, s3fs can download a part of a object by requesting with
"Range" http header. It seems like downloading by each block unit.
The default block(part) size is 50MB, it is caused the result which is default
parallel requests count(5) by default multipart upload size(10MB).
If you need to change this block size, you can change by new option
"fd_page_size". This option can take from 1MB(1024 * 1024) to any bytes.
So that, you have to take care about that fdcache.cpp(and fdcache.h) were
changed a lot.
2) About local cache
Local cache files which are in directory specified by "use_cache" option do
not have always all of object data.
This cause is that s3fs uses ftruncate function and reads(writes) each block
unit of a temporary file.
s3fs manages each block unit's status which are "downloaded area" or "not".
For this status, s3fs makes new temporary file in cache directory which is
specified by "use_cache" option. This status files is in a directory which is
named "<use_cache sirectory>/.<bucket_name>/".
When s3fs opens this status file, s3fs locks this file for exclusive control by
calling flock function. You need to take care about this, the status files can
not be laid on network drive(like NFS).
This revision changes about file open mode, s3fs always opens a local cache
file and each status file with writable mode.
Last, this revision adds new option "del_cache", this option means that s3fs
deletes all local cache file when s3fs starts and exits.
3) Uploading
When s3fs writes data to file descriptor through FUSE request, old s3fs
revision downloads all of the object. But new revision does not download all,
it downloads only small percial area(some block units) including writing data
area.
And when s3fs closes or flushes the file descriptor, s3fs downloads other area
which is not downloaded from server. After that, s3fs uploads all of data.
Already r456 revision has parallel upload function, then this revision with
r456 and r457 are very big change for performance.
4) Downloading
Because of the changes to the temporary file and the local cache file, when
s3fs downloads an object it downloads only the required range (some block
units).
s3fs also downloads the units with parallel GET requests, in the same way as
for uploading. (The maximum parallel request count and each download size are
specified by the same parameters as for uploading.)
In the new revision, when s3fs opens a file, it returns the file descriptor
immediately, because s3fs only opens (creates) the file descriptor without
downloading any data. When s3fs reads data, it downloads only the block units
that include the specified area.
This improves performance.
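The block-unit download described above can be illustrated with a short sketch. It is not taken from s3fs: the ByteRange type and the AlignToBlocks and MakeRangeHeader functions are hypothetical names, and the block size is simply passed in as a parameter. The sketch expands a requested read area to whole block units and builds the matching HTTP "Range" header value.

```cpp
#include <algorithm>
#include <cstdint>
#include <sstream>
#include <string>

// Hypothetical helper type (not from s3fs): the byte range [start, start + size).
struct ByteRange
{
    int64_t start;
    int64_t size;
};

// Expand the requested area to whole block units, clipped to the object size.
// Assumes 0 <= offset < object_size, 0 < length and 0 < block_size.
ByteRange AlignToBlocks(int64_t offset, int64_t length, int64_t block_size, int64_t object_size)
{
    int64_t first = (offset / block_size) * block_size;                              // round down
    int64_t last  = ((offset + length + block_size - 1) / block_size) * block_size;  // round up
    last          = std::min(last, object_size);
    ByteRange range = {first, last - first};
    return range;
}

// Build the value of the HTTP "Range" header for that area (the end is inclusive).
std::string MakeRangeHeader(const ByteRange& range)
{
    std::ostringstream oss;
    oss << "bytes=" << range.start << "-" << (range.start + range.size - 1);
    return oss.str();
}
```

For example, with a 50MB block size, reading 4096 bytes at offset 60MB of a 120MB object would request the second block only, that is "bytes=52428800-104857599".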
5) Changed option name
The option "parallel_upload", which was added at r456, is renamed to
"parallel_count". The reason is that this option value is used not only for
uploading objects but also for downloading objects. (For a while,
you can still use the old option name "parallel_upload" for compatibility.)
git-svn-id: http://s3fs.googlecode.com/svn/trunk@458 df820570-a93a-0410-bd06-b72b767a4274
2013-07-23 16:01:48 +00:00
2020-08-22 12:40:53 +00:00
// Multi request
if ( 0 ! = ( result = curlmulti . Request ( ) ) ) {
S3FS_PRN_ERR ( " error occurred in multi request(errno=%d). " , result ) ;
break ;
}
2013-07-23 16:01:48 +00:00
2020-08-22 12:40:53 +00:00
// reinit for loop.
curlmulti . Clear ( ) ;
2013-07-23 16:01:48 +00:00
}
2020-08-22 12:40:53 +00:00
return result ;
2013-07-23 16:01:48 +00:00
}
2019-02-25 12:47:10 +00:00
bool S3fsCurl : : UploadMultipartPostSetCurlOpts ( S3fsCurl * s3fscurl )
{
2020-08-22 12:40:53 +00:00
if ( ! s3fscurl ) {
return false ;
}
if ( ! s3fscurl - > CreateCurlHandle ( ) ) {
return false ;
}
2021-09-01 23:07:06 +00:00
if ( CURLE_OK ! = curl_easy_setopt ( s3fscurl - > hCurl , CURLOPT_URL , s3fscurl - > url . c_str ( ) ) ) {
return false ;
}
if ( CURLE_OK ! = curl_easy_setopt ( s3fscurl - > hCurl , CURLOPT_UPLOAD , true ) ) { // HTTP PUT
return false ;
}
if ( CURLE_OK ! = curl_easy_setopt ( s3fscurl - > hCurl , CURLOPT_WRITEDATA , ( void * ) ( & s3fscurl - > bodydata ) ) ) {
return false ;
}
if ( CURLE_OK ! = curl_easy_setopt ( s3fscurl - > hCurl , CURLOPT_WRITEFUNCTION , WriteMemoryCallback ) ) {
return false ;
}
if ( CURLE_OK ! = curl_easy_setopt ( s3fscurl - > hCurl , CURLOPT_HEADERDATA , ( void * ) & ( s3fscurl - > responseHeaders ) ) ) {
return false ;
}
if ( CURLE_OK ! = curl_easy_setopt ( s3fscurl - > hCurl , CURLOPT_HEADERFUNCTION , HeaderCallback ) ) {
return false ;
}
if ( CURLE_OK ! = curl_easy_setopt ( s3fscurl - > hCurl , CURLOPT_INFILESIZE_LARGE , static_cast < curl_off_t > ( s3fscurl - > partdata . size ) ) ) { // Content-Length
return false ;
}
if ( CURLE_OK ! = curl_easy_setopt ( s3fscurl - > hCurl , CURLOPT_READFUNCTION , UploadReadCallback ) ) {
return false ;
}
if ( CURLE_OK ! = curl_easy_setopt ( s3fscurl - > hCurl , CURLOPT_READDATA , ( void * ) s3fscurl ) ) {
return false ;
}
if ( ! S3fsCurl : : AddUserAgent ( s3fscurl - > hCurl ) ) { // put User-Agent
return false ;
}
2020-08-22 12:40:53 +00:00
return true ;
2019-02-25 12:47:10 +00:00
}
bool S3fsCurl : : CopyMultipartPostSetCurlOpts ( S3fsCurl * s3fscurl )
{
2020-08-22 12:40:53 +00:00
if ( ! s3fscurl ) {
return false ;
}
if ( ! s3fscurl - > CreateCurlHandle ( ) ) {
return false ;
}
2021-09-01 23:07:06 +00:00
if ( CURLE_OK ! = curl_easy_setopt ( s3fscurl - > hCurl , CURLOPT_URL , s3fscurl - > url . c_str ( ) ) ) {
return false ;
}
if ( CURLE_OK ! = curl_easy_setopt ( s3fscurl - > hCurl , CURLOPT_UPLOAD , true ) ) { // HTTP PUT
return false ;
}
if ( CURLE_OK ! = curl_easy_setopt ( s3fscurl - > hCurl , CURLOPT_WRITEDATA , ( void * ) ( & s3fscurl - > bodydata ) ) ) {
return false ;
}
if ( CURLE_OK ! = curl_easy_setopt ( s3fscurl - > hCurl , CURLOPT_WRITEFUNCTION , WriteMemoryCallback ) ) {
return false ;
}
if ( CURLE_OK ! = curl_easy_setopt ( s3fscurl - > hCurl , CURLOPT_HEADERDATA , ( void * ) ( & s3fscurl - > headdata ) ) ) {
return false ;
}
if ( CURLE_OK ! = curl_easy_setopt ( s3fscurl - > hCurl , CURLOPT_HEADERFUNCTION , WriteMemoryCallback ) ) {
return false ;
}
if ( CURLE_OK ! = curl_easy_setopt ( s3fscurl - > hCurl , CURLOPT_INFILESIZE , 0 ) ) { // Content-Length
return false ;
}
if ( ! S3fsCurl : : AddUserAgent ( s3fscurl - > hCurl ) ) { // put User-Agent
return false ;
}
2020-08-22 12:40:53 +00:00
return true ;
2019-02-25 12:47:10 +00:00
}
bool S3fsCurl : : PreGetObjectRequestSetCurlOpts ( S3fsCurl * s3fscurl )
{
2020-08-22 12:40:53 +00:00
if ( ! s3fscurl ) {
return false ;
}
if ( ! s3fscurl - > CreateCurlHandle ( ) ) {
return false ;
}
2019-02-25 12:47:10 +00:00
2021-09-01 23:07:06 +00:00
if ( CURLE_OK ! = curl_easy_setopt ( s3fscurl - > hCurl , CURLOPT_URL , s3fscurl - > url . c_str ( ) ) ) {
return false ;
}
if ( CURLE_OK ! = curl_easy_setopt ( s3fscurl - > hCurl , CURLOPT_WRITEFUNCTION , DownloadWriteCallback ) ) {
return false ;
}
if ( CURLE_OK ! = curl_easy_setopt ( s3fscurl - > hCurl , CURLOPT_WRITEDATA , ( void * ) s3fscurl ) ) {
return false ;
}
if ( ! S3fsCurl : : AddUserAgent ( s3fscurl - > hCurl ) ) { // put User-Agent
return false ;
}
2019-02-25 12:47:10 +00:00
2020-08-22 12:40:53 +00:00
return true ;
2019-02-25 12:47:10 +00:00
}
bool S3fsCurl : : PreHeadRequestSetCurlOpts ( S3fsCurl * s3fscurl )
{
2020-08-22 12:40:53 +00:00
if ( ! s3fscurl ) {
return false ;
}
if ( ! s3fscurl - > CreateCurlHandle ( ) ) {
return false ;
}
2019-02-25 12:47:10 +00:00
2021-09-01 23:07:06 +00:00
if ( CURLE_OK ! = curl_easy_setopt ( s3fscurl - > hCurl , CURLOPT_URL , s3fscurl - > url . c_str ( ) ) ) {
return false ;
}
if ( CURLE_OK ! = curl_easy_setopt ( s3fscurl - > hCurl , CURLOPT_NOBODY , true ) ) { // HEAD
return false ;
}
if ( CURLE_OK ! = curl_easy_setopt ( s3fscurl - > hCurl , CURLOPT_FILETIME , true ) ) { // Last-Modified
return false ;
}
2019-02-25 12:47:10 +00:00
2020-08-22 12:40:53 +00:00
// responseHeaders
2021-09-01 23:07:06 +00:00
if ( CURLE_OK ! = curl_easy_setopt ( s3fscurl - > hCurl , CURLOPT_HEADERDATA , ( void * ) & ( s3fscurl - > responseHeaders ) ) ) {
return false ;
}
if ( CURLE_OK ! = curl_easy_setopt ( s3fscurl - > hCurl , CURLOPT_HEADERFUNCTION , HeaderCallback ) ) {
return false ;
}
if ( ! S3fsCurl : : AddUserAgent ( s3fscurl - > hCurl ) ) { // put User-Agent
return false ;
}
2019-02-25 12:47:10 +00:00
2020-08-22 12:40:53 +00:00
return true ;
2019-02-25 12:47:10 +00:00
}
2013-10-06 13:45:32 +00:00
bool S3fsCurl : : ParseIAMCredentialResponse ( const char * response , iamcredmap_t & keyval )
{
2020-08-22 12:40:53 +00:00
if ( ! response ) {
return false ;
}
2020-09-11 09:37:24 +00:00
std : : istringstream sscred ( response ) ;
std : : string oneline ;
2020-08-22 12:40:53 +00:00
keyval . clear ( ) ;
while ( getline ( sscred , oneline , ' , ' ) ) {
2020-09-11 09:37:24 +00:00
std : : string : : size_type pos ;
std : : string key ;
std : : string val ;
if ( std : : string : : npos ! = ( pos = oneline . find ( IAMCRED_ACCESSKEYID ) ) ) {
2020-08-22 12:40:53 +00:00
key = IAMCRED_ACCESSKEYID ;
2020-09-11 09:37:24 +00:00
} else if ( std : : string : : npos ! = ( pos = oneline . find ( IAMCRED_SECRETACCESSKEY ) ) ) {
2020-08-22 12:40:53 +00:00
key = IAMCRED_SECRETACCESSKEY ;
2020-09-11 09:37:24 +00:00
} else if ( std : : string : : npos ! = ( pos = oneline . find ( S3fsCurl : : IAM_token_field ) ) ) {
2020-08-22 12:40:53 +00:00
key = S3fsCurl : : IAM_token_field ;
2020-09-11 09:37:24 +00:00
} else if ( std : : string : : npos ! = ( pos = oneline . find ( S3fsCurl : : IAM_expiry_field ) ) ) {
2020-08-22 12:40:53 +00:00
key = S3fsCurl : : IAM_expiry_field ;
2020-09-11 09:37:24 +00:00
} else if ( std : : string : : npos ! = ( pos = oneline . find ( IAMCRED_ROLEARN ) ) ) {
2020-08-22 12:40:53 +00:00
key = IAMCRED_ROLEARN ;
} else {
continue ;
}
2020-09-11 09:37:24 +00:00
if ( std : : string : : npos = = ( pos = oneline . find ( ' : ' , pos + key . length ( ) ) ) ) {
2020-08-22 12:40:53 +00:00
continue ;
}
if ( S3fsCurl : : is_ibm_iam_auth & & key = = S3fsCurl : : IAM_expiry_field ) {
// parse integer value
2020-09-11 09:37:24 +00:00
if ( std : : string : : npos = = ( pos = oneline . find_first_of ( " 0123456789 " , pos ) ) ) {
2020-08-22 12:40:53 +00:00
continue ;
}
2021-01-25 09:02:32 +00:00
oneline . erase ( 0 , pos ) ;
2020-09-11 09:37:24 +00:00
if ( std : : string : : npos = = ( pos = oneline . find_last_of ( " 0123456789 " ) ) ) {
2020-08-22 12:40:53 +00:00
continue ;
}
val = oneline . substr ( 0 , pos + 1 ) ;
} else {
2020-09-11 09:37:24 +00:00
// parse std::string value (starts and ends with quotes)
if ( std : : string : : npos = = ( pos = oneline . find ( ' \" ' , pos ) ) ) {
2020-08-22 12:40:53 +00:00
continue ;
}
2021-01-25 09:02:32 +00:00
oneline . erase ( 0 , pos + 1 ) ;
2020-09-11 09:37:24 +00:00
if ( std : : string : : npos = = ( pos = oneline . find ( ' \" ' ) ) ) {
2020-08-22 12:40:53 +00:00
continue ;
}
val = oneline . substr ( 0 , pos ) ;
}
keyval [ key ] = val ;
}
return true ;
}
2013-10-06 13:45:32 +00:00
2020-10-15 17:18:19 +00:00
bool S3fsCurl : : SetIAMv2APIToken ( const char * response )
{
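    // Guard against a NULL response before it is logged and copied into a std::string
    if(!response){
        return false;
    }
    S3FS_PRN_INFO3("Setting AWS IMDSv2 API token to %s", response);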
S3fsCurl : : IAMv2_api_token = std : : string ( response ) ;
2020-10-30 16:59:55 +00:00
return true ;
2020-10-15 17:18:19 +00:00
}
2013-10-06 13:45:32 +00:00
bool S3fsCurl : : SetIAMCredentials ( const char * response )
{
2020-08-22 12:40:53 +00:00
S3FS_PRN_INFO3 ( " IAM credential response = \" %s \" " , response ) ;
2013-10-06 13:45:32 +00:00
2020-08-22 12:40:53 +00:00
iamcredmap_t keyval ;
2013-10-06 13:45:32 +00:00
2020-08-22 12:40:53 +00:00
if ( ! ParseIAMCredentialResponse ( response , keyval ) ) {
return false ;
}
2017-11-06 21:45:58 +00:00
2020-08-22 12:40:53 +00:00
if ( S3fsCurl : : IAM_field_count ! = keyval . size ( ) ) {
return false ;
}
2013-10-06 13:45:32 +00:00
2020-09-11 09:37:24 +00:00
S3fsCurl : : AWSAccessToken = keyval [ std : : string ( S3fsCurl : : IAM_token_field ) ] ;
2017-11-23 08:46:24 +00:00
2020-08-22 12:40:53 +00:00
if ( S3fsCurl : : is_ibm_iam_auth ) {
off_t tmp_expire = 0 ;
2020-10-02 01:23:56 +00:00
if ( ! s3fs_strtoofft ( & tmp_expire , keyval [ std : : string ( S3fsCurl : : IAM_expiry_field ) ] . c_str ( ) , /*base=*/ 10 ) ) {
2020-08-22 12:40:53 +00:00
return false ;
}
S3fsCurl : : AWSAccessTokenExpire = static_cast < time_t > ( tmp_expire ) ;
} else {
2020-09-11 09:37:24 +00:00
S3fsCurl : : AWSAccessKeyId = keyval [ std : : string ( IAMCRED_ACCESSKEYID ) ] ;
S3fsCurl : : AWSSecretAccessKey = keyval [ std : : string ( IAMCRED_SECRETACCESSKEY ) ] ;
2020-08-22 12:40:53 +00:00
S3fsCurl : : AWSAccessTokenExpire = cvtIAMExpireStringToTime ( keyval [ S3fsCurl : : IAM_expiry_field ] . c_str ( ) ) ;
2020-05-03 08:08:28 +00:00
}
2020-08-22 12:40:53 +00:00
return true ;
2013-10-06 13:45:32 +00:00
}
2019-01-23 23:44:50 +00:00
bool S3fsCurl : : CheckIAMCredentialUpdate ( )
2013-10-06 13:45:32 +00:00
{
2020-08-22 12:40:53 +00:00
if ( S3fsCurl : : IAM_role . empty ( ) & & ! S3fsCurl : : is_ecs & & ! S3fsCurl : : is_ibm_iam_auth ) {
return true ;
}
if ( time ( NULL ) + IAM_EXPIRE_MERGIN < = S3fsCurl : : AWSAccessTokenExpire ) {
return true ;
}
2020-09-25 07:48:46 +00:00
S3FS_PRN_INFO ( " IAM Access Token refreshing... " ) ;
2020-08-22 12:40:53 +00:00
// update
S3fsCurl s3fscurl ;
if ( 0 ! = s3fscurl . GetIAMCredentials ( ) ) {
2020-09-25 07:48:46 +00:00
S3FS_PRN_ERR ( " IAM Access Token refresh failed " ) ;
2020-08-22 12:40:53 +00:00
return false ;
}
2020-09-25 07:48:46 +00:00
S3FS_PRN_INFO ( " IAM Access Token refreshed " ) ;
2013-10-06 13:45:32 +00:00
return true ;
}
2020-09-11 09:37:24 +00:00
bool S3fsCurl : : ParseIAMRoleFromMetaDataResponse ( const char * response , std : : string & rolename )
2016-05-06 04:37:32 +00:00
{
2020-08-22 12:40:53 +00:00
if ( ! response ) {
return false ;
}
// [NOTE]
// expected following strings.
//
// myrolename
//
2020-09-11 09:37:24 +00:00
std : : istringstream ssrole ( response ) ;
std : : string oneline ;
2020-08-22 12:40:53 +00:00
if ( getline ( ssrole , oneline , ' \n ' ) ) {
rolename = oneline ;
return ! rolename . empty ( ) ;
}
2016-05-06 04:37:32 +00:00
return false ;
}
bool S3fsCurl : : SetIAMRoleFromMetaData ( const char * response )
{
2020-08-22 12:40:53 +00:00
S3FS_PRN_INFO3 ( " IAM role name response = \" %s \" " , response ) ;
2016-05-06 04:37:32 +00:00
2020-09-11 09:37:24 +00:00
std : : string rolename ;
2016-05-06 04:37:32 +00:00
2020-08-22 12:40:53 +00:00
if ( ! S3fsCurl : : ParseIAMRoleFromMetaDataResponse ( response , rolename ) ) {
return false ;
}
2016-05-06 04:37:32 +00:00
2020-08-22 12:40:53 +00:00
SetIAMRole ( rolename . c_str ( ) ) ;
return true ;
2016-05-06 04:37:32 +00:00
}
2016-04-17 07:44:03 +00:00
bool S3fsCurl : : AddUserAgent ( CURL * hCurl )
{
2020-08-22 12:40:53 +00:00
if ( ! hCurl ) {
return false ;
}
if ( S3fsCurl : : IsUserAgentFlag ( ) ) {
curl_easy_setopt ( hCurl , CURLOPT_USERAGENT , S3fsCurl : : userAgent . c_str ( ) ) ;
}
return true ;
2016-04-17 07:44:03 +00:00
}
2021-10-07 13:19:36 +00:00
int S3fsCurl : : CurlDebugFunc ( const CURL * hcurl , curl_infotype type , char * data , size_t size , void * userptr )
2020-05-24 07:23:27 +00:00
{
2020-08-22 12:40:53 +00:00
return S3fsCurl : : RawCurlDebugFunc ( hcurl , type , data , size , userptr , CURLINFO_END ) ;
2020-05-24 07:23:27 +00:00
}
2021-10-07 13:19:36 +00:00
int S3fsCurl : : CurlDebugBodyInFunc ( const CURL * hcurl , curl_infotype type , char * data , size_t size , void * userptr )
2020-05-24 07:23:27 +00:00
{
2020-08-22 12:40:53 +00:00
return S3fsCurl : : RawCurlDebugFunc ( hcurl , type , data , size , userptr , CURLINFO_DATA_IN ) ;
2020-05-24 07:23:27 +00:00
}
2021-10-07 13:19:36 +00:00
int S3fsCurl : : CurlDebugBodyOutFunc ( const CURL * hcurl , curl_infotype type , char * data , size_t size , void * userptr )
2020-05-24 07:23:27 +00:00
{
2020-08-22 12:40:53 +00:00
return S3fsCurl : : RawCurlDebugFunc ( hcurl , type , data , size , userptr , CURLINFO_DATA_OUT ) ;
2020-05-24 07:23:27 +00:00
}
2021-10-07 13:19:36 +00:00
int S3fsCurl : : RawCurlDebugFunc ( const CURL * hcurl , curl_infotype type , char * data , size_t size , void * userptr , curl_infotype datatype )
2015-09-30 19:41:27 +00:00
{
2020-08-22 12:40:53 +00:00
    if(!hcurl){
        // something wrong...
        return 0;
    }

    switch(type){
        case CURLINFO_TEXT:
            // Swap tab indentation with spaces so it stays pretty in syslog
            int indent;
            indent = 0;
            // Check size first so that an empty buffer is never dereferenced
            while(0 < size && '\t' == *data){
                indent += 4;
                size--;
                data++;
            }
            if(foreground && 0 < size && '\n' == data[size - 1]){
                size--;
            }
            S3FS_PRN_CURL("* %*s%.*s", indent, "", (int)size, data);
            break;

        case CURLINFO_DATA_IN:
        case CURLINFO_DATA_OUT:
            if(type != datatype || !S3fsCurl::is_dump_body){
                // not put
                break;
            }
            // fall through: the body is printed the same way as the headers

        case CURLINFO_HEADER_IN:
        case CURLINFO_HEADER_OUT:
            size_t remaining;
            char*  p;

            // Print each line individually for tidy output
            remaining = size;
            p         = data;
            do{
                char* eol     = (char*)memchr(p, '\n', remaining);
                int   newline = 0;
                if(eol == NULL){
                    eol = (char*)memchr(p, '\r', remaining);
                    if(eol == NULL){
                        // no line terminator left: print the rest of the buffer
                        eol = p + remaining;
                    }else{
                        // lone CR: consume it so that the loop always advances
                        newline++;
                        eol++;
                    }
                }else{
                    if(eol > p && *(eol - 1) == '\r'){
                        newline++;
                    }
                    newline++;
                    eol++;
                }
                size_t length = eol - p;
                S3FS_PRN_CURL("%s %.*s", getCurlDebugHead(type), (int)length - newline, p);
                remaining -= length;
                p = eol;
            }while(remaining > 0);
            break;

        case CURLINFO_SSL_DATA_IN:
        case CURLINFO_SSL_DATA_OUT:
            // not put
            break;

        default:
            // unknown info type, nothing to print
            break;
    }
2015-09-30 19:41:27 +00:00
return 0 ;
}
2013-07-05 02:28:31 +00:00
//-------------------------------------------------------------------
// Methods for S3fsCurl
//-------------------------------------------------------------------
Fixed Issue 229 and changed code
1) Set metadata "Content-Encoding" automatically (Issue 292)
For this issue, a new option "ahbe_conf" is added to s3fs.
The option gives the path of a configuration file, and this file specifies
additional HTTP headers by file (object) extension.
Thus you can specify any HTTP header for each object by extension.
* ahbe_conf file format:
-----------
line = [file suffix] HTTP-header [HTTP-header-values]
file suffix = file(object) suffix, if this field is empty,
it means "*"(all object).
HTTP-header = additional HTTP header name
HTTP-header-values = additional HTTP header value
-----------
* Example:
-----------
.gz Content-Encoding gzip
.Z Content-Encoding compress
X-S3FS-MYHTTPHEAD myvalue
-----------
A sample configuration file is included in the "test" directory.
If the ahbe_conf parameter is specified, s3fs loads its configuration
and compares the extension (suffix) of the object (file) when uploading
(PUT/POST) it. If the extension matches, s3fs adds and sends the specified
HTTP header and value.
With the sample configuration file, if an object with the ".gz" extension
that already has the Content-Encoding HTTP header is renamed
to a ".txt" extension, s3fs does not set Content-Encoding, because
".txt" does not match any line in the configuration file.
In other words, s3fs matches the extension on each PUT/POST action.
* Please take care with "Content-Encoding".
This new option allows setting ANY HTTP header by object extension.
For example, you can specify "Content-Encoding" for the ".gz" (etc.)
extension in the configuration. But this means that S3 always returns
"Content-Encoding: gzip", even when a client sends a different
"Accept-Encoding:" header. This is not desirable.
Please see RFC 2616.
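The ahbe_conf line format and the suffix matching described above can be sketched as follows. This is only an illustration, not the s3fs parser: AddHeaderRule, ParseRule and RuleMatches are hypothetical names, and the sketch assumes that a line with three whitespace-separated fields is "suffix header value" while a line with two fields is "header value" with an empty suffix (all objects).

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical representation of one ahbe_conf line (not the s3fs type).
struct AddHeaderRule
{
    std::string suffix;   // empty means "*" (all objects)
    std::string header;   // additional HTTP header name
    std::string value;    // additional HTTP header value
};

// Parse one line. Assumption of this sketch: three fields are
// "suffix header value", two fields are "header value" with an empty suffix.
bool ParseRule(const std::string& line, AddHeaderRule& rule)
{
    std::istringstream iss(line);
    std::vector<std::string> fields;
    std::string field;
    while(iss >> field){
        fields.push_back(field);
    }
    if(3 == fields.size()){
        rule.suffix = fields[0];
        rule.header = fields[1];
        rule.value  = fields[2];
    }else if(2 == fields.size()){
        rule.suffix.clear();
        rule.header = fields[0];
        rule.value  = fields[1];
    }else{
        return false;
    }
    return true;
}

// True when the rule applies to the object path: the path ends with the
// suffix, or the suffix is empty.
bool RuleMatches(const AddHeaderRule& rule, const std::string& path)
{
    if(rule.suffix.empty()){
        return true;
    }
    if(path.size() < rule.suffix.size()){
        return false;
    }
    return 0 == path.compare(path.size() - rule.suffix.size(), rule.suffix.size(), rule.suffix);
}
```

Under these assumptions, the rule parsed from ".gz Content-Encoding gzip" applies to "/dir/archive.tar.gz" but not to "/dir/archive.txt", which mirrors the rename case described above.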
2) Changes to the allow_other/uid/gid options for the mount point
I reviewed the mount point permission and the allow_other/uid/gid
options, and found bugs in them.
The bugs are fixed and s3fs now follows these specifications:
* s3fs allows the uid (gid) option to be 0 (root) only when the effective
user is 0 (root).
* The mount point (directory) must have permissions that allow
access by the effective user/group.
* If the allow_other option is specified, the mount point permission
is set to 0777 (all users are allowed all access).
Otherwise, the mount point is set to 0700 (only the effective user
is allowed).
* When the uid/gid option is specified, the mount point owner/group
is set to the uid/gid option value.
If uid/gid is not set, it is set to the effective user/group id.
These changes may fix some issues (321, 338).
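The mount point rules listed above can be summarized in a small sketch. It is an illustration only, not the s3fs code: MountPointAttr, DecideMountPointAttr and IsAllowedIdOption are hypothetical names, and the option values are passed in as plain parameters.

```cpp
#include <sys/types.h>

// Hypothetical summary of the mount point settings (not an s3fs type).
struct MountPointAttr
{
    mode_t mode;
    uid_t  owner;
    gid_t  group;
};

// is_set_uid / is_set_gid tell whether the uid / gid options were given.
MountPointAttr DecideMountPointAttr(bool allow_other, bool is_set_uid, uid_t opt_uid, bool is_set_gid, gid_t opt_gid, uid_t euid, gid_t egid)
{
    MountPointAttr attr;
    attr.mode  = allow_other ? 0777 : 0700;     // all users, or the effective user only
    attr.owner = is_set_uid ? opt_uid : euid;   // uid option value, else effective user id
    attr.group = is_set_gid ? opt_gid : egid;   // gid option value, else effective group id
    return attr;
}

// A uid (gid) option value of 0 (root) is accepted only when the effective user is root.
bool IsAllowedIdOption(unsigned long opt_id, uid_t euid)
{
    return 0 != opt_id || 0 == euid;
}
```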
3) Changed the logic for Issue 229
The chmod command returns -EIO when changing the mount point.
This is correct: s3fs cannot change the owner/group/mtime of the
mount point. However, s3fs used to send a request to change the bucket.
This revision does not send the request, and returns EIO as
soon as possible.
git-svn-id: http://s3fs.googlecode.com/svn/trunk@465 df820570-a93a-0410-bd06-b72b767a4274
2013-08-16 19:24:01 +00:00
S3fsCurl : : S3fsCurl ( bool ahbe ) :
2020-11-09 12:15:20 +00:00
hCurl ( NULL ) , type ( REQTYPE_UNSET ) , requestHeaders ( NULL ) ,
2019-08-22 15:22:57 +00:00
LastResponseCode ( S3FSCURL_RESPONSECODE_NOTSET ) , postdata ( NULL ) , postdata_remaining ( 0 ) , is_use_ahbe ( ahbe ) ,
2014-07-19 19:02:55 +00:00
retry_count ( 0 ) , b_infile ( NULL ) , b_postdata ( NULL ) , b_postdata_remaining ( 0 ) , b_partdata_startpos ( 0 ) , b_partdata_size ( 0 ) ,
2020-11-09 12:15:20 +00:00
b_ssekey_pos ( - 1 ) , b_ssetype ( sse_type_t : : SSE_DISABLE ) ,
2021-05-06 10:40:35 +00:00
sem ( NULL ) , completed_tids_lock ( NULL ) , completed_tids ( NULL ) , fpLazySetup ( NULL ) , curlCode ( CURLE_OK )
2013-07-05 02:28:31 +00:00
{
2011-09-01 19:24:12 +00:00
}
2013-07-05 02:28:31 +00:00
S3fsCurl : : ~ S3fsCurl ( )
2013-03-30 13:37:14 +00:00
{
2020-08-22 12:40:53 +00:00
DestroyCurlHandle ( ) ;
2013-07-05 02:28:31 +00:00
}
2013-03-30 13:37:14 +00:00
2020-08-17 00:10:49 +00:00
bool S3fsCurl : : ResetHandle ( bool lock_already_held )
2013-07-05 02:28:31 +00:00
{
2020-10-03 12:09:35 +00:00
bool run_once ;
{
AutoLock lock ( & S3fsCurl : : curl_warnings_lock ) ;
run_once = curl_warnings_once ;
curl_warnings_once = true ;
}
2020-08-22 12:40:53 +00:00
curl_easy_reset ( hCurl ) ;
2021-09-01 23:07:06 +00:00
if ( CURLE_OK ! = curl_easy_setopt ( hCurl , CURLOPT_NOSIGNAL , 1 ) ) {
return false ;
}
if ( CURLE_OK ! = curl_easy_setopt ( hCurl , CURLOPT_FOLLOWLOCATION , true ) ) {
return false ;
}
if ( CURLE_OK ! = curl_easy_setopt ( hCurl , CURLOPT_CONNECTTIMEOUT , S3fsCurl : : connect_timeout ) ) {
return false ;
}
if ( CURLE_OK ! = curl_easy_setopt ( hCurl , CURLOPT_NOPROGRESS , 0 ) ) {
return false ;
}
if ( CURLE_OK ! = curl_easy_setopt ( hCurl , CURLOPT_PROGRESSFUNCTION , S3fsCurl : : CurlProgress ) ) {
return false ;
}
if ( CURLE_OK ! = curl_easy_setopt ( hCurl , CURLOPT_PROGRESSDATA , hCurl ) ) {
return false ;
}
2020-08-22 12:40:53 +00:00
// curl_easy_setopt(hCurl, CURLOPT_FORBID_REUSE, 1);
    if(CURLE_OK != curl_easy_setopt(hCurl, S3FS_CURLOPT_TCP_KEEPALIVE, 1) && !run_once){
        S3FS_PRN_WARN("The CURLOPT_TCP_KEEPALIVE option could not be set. To maximize performance, this option should be enabled; it requires libcurl 7.25.0 or later.");
    }
    if(CURLE_OK != curl_easy_setopt(hCurl, S3FS_CURLOPT_SSL_ENABLE_ALPN, 0) && !run_once){
        S3FS_PRN_WARN("The CURLOPT_SSL_ENABLE_ALPN option could not be unset. The S3 server does not support ALPN, so this option should be disabled to maximize performance; it requires libcurl 7.36.0 or later.");
    }
    if(CURLE_OK != curl_easy_setopt(hCurl, S3FS_CURLOPT_KEEP_SENDING_ON_ERROR, 1) && !run_once){
        S3FS_PRN_WARN("The CURLOPT_KEEP_SENDING_ON_ERROR option could not be set. To maximize performance, this option should be enabled; it requires libcurl 7.51.0 or later.");
    }
if ( type ! = REQTYPE_IAMCRED & & type ! = REQTYPE_IAMROLE ) {
// REQTYPE_IAMCRED and REQTYPE_IAMROLE are always HTTP
if ( 0 = = S3fsCurl : : ssl_verify_hostname ) {
2021-09-01 23:07:06 +00:00
if ( CURLE_OK ! = curl_easy_setopt ( hCurl , CURLOPT_SSL_VERIFYHOST , 0 ) ) {
return false ;
}
2020-08-22 12:40:53 +00:00
}
if ( ! S3fsCurl : : curl_ca_bundle . empty ( ) ) {
2021-09-01 23:07:06 +00:00
if ( CURLE_OK ! = curl_easy_setopt ( hCurl , CURLOPT_CAINFO , S3fsCurl : : curl_ca_bundle . c_str ( ) ) ) {
return false ;
}
2020-08-22 12:40:53 +00:00
}
}
if ( ( S3fsCurl : : is_dns_cache | | S3fsCurl : : is_ssl_session_cache ) & & S3fsCurl : : hCurlShare ) {
2021-09-01 23:07:06 +00:00
if ( CURLE_OK ! = curl_easy_setopt ( hCurl , CURLOPT_SHARE , S3fsCurl : : hCurlShare ) ) {
return false ;
}
2020-08-22 12:40:53 +00:00
}
if ( ! S3fsCurl : : is_cert_check ) {
S3FS_PRN_DBG ( " 'no_check_certificate' option in effect. " ) ;
S3FS_PRN_DBG ( " The server certificate won't be checked against the available certificate authorities. " ) ;
2021-09-01 23:07:06 +00:00
if ( CURLE_OK ! = curl_easy_setopt ( hCurl , CURLOPT_SSL_VERIFYPEER , false ) ) {
return false ;
}
2020-08-22 12:40:53 +00:00
}
if ( S3fsCurl : : is_verbose ) {
2021-09-01 23:07:06 +00:00
if ( CURLE_OK ! = curl_easy_setopt ( hCurl , CURLOPT_VERBOSE , true ) ) {
return false ;
}
if ( CURLE_OK ! = curl_easy_setopt ( hCurl , CURLOPT_DEBUGFUNCTION , S3fsCurl : : CurlDebugFunc ) ) {
return false ;
}
2020-08-22 12:40:53 +00:00
}
if ( ! cipher_suites . empty ( ) ) {
2021-09-01 23:07:06 +00:00
if ( CURLE_OK ! = curl_easy_setopt ( hCurl , CURLOPT_SSL_CIPHER_LIST , cipher_suites . c_str ( ) ) ) {
return false ;
}
2020-08-22 12:40:53 +00:00
}
AutoLock lock ( & S3fsCurl : : curl_handles_lock , lock_already_held ? AutoLock : : ALREADY_LOCKED : AutoLock : : NONE ) ;
S3fsCurl : : curl_times [ hCurl ] = time ( 0 ) ;
S3fsCurl : : curl_progress [ hCurl ] = progress_t ( - 1 , - 1 ) ;
return true ;
2013-08-21 07:43:32 +00:00
}
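
// The "warn only once" logic in ResetHandle() can be sketched in isolation as
// below. This is a minimal illustration, not the s3fs implementation:
// WarnOnceFlag and ResetPass are hypothetical names, std::mutex stands in for
// the AutoLock/curl_warnings_lock pair, and the bool arguments model the result
// of "CURLE_OK == curl_easy_setopt(...)" for two optional settings.

```cpp
#include <mutex>

// Hypothetical stand-in for the curl_warnings_lock / curl_warnings_once pair.
class WarnOnceFlag
{
    public:
        // Returns the previous state and marks the flag as set, under the lock.
        bool TestAndSet()
        {
            std::lock_guard<std::mutex> guard(lock);
            bool previous = flagged;
            flagged       = true;
            return previous;
        }

    private:
        std::mutex lock;
        bool       flagged = false;
};

// Models one ResetHandle-style pass and returns how many warnings it would log.
// A warning is counted only when the option failed and no earlier pass has run.
inline int ResetPass(WarnOnceFlag& flag, bool keepalive_ok, bool alpn_ok)
{
    const bool run_once = flag.TestAndSet();
    int warnings = 0;
    if(!keepalive_ok && !run_once){
        ++warnings;
    }
    if(!alpn_ok && !run_once){
        ++warnings;
    }
    return warnings;
}
```

// Note that the flag is set by the first pass even when every option succeeds,
// so later failures stay silent; this mirrors the order of operations above.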

bool S3fsCurl::CreateCurlHandle(bool only_pool, bool remake)
{
    AutoLock lock(&S3fsCurl::curl_handles_lock);

    if(hCurl && remake){
        if(!DestroyCurlHandle(false)){
            S3FS_PRN_ERR("could not destroy handle.");
            return false;
        }
        S3FS_PRN_INFO3("already has handle, so destroyed it or restored it to pool.");
    }

    if(!hCurl){
        if(NULL == (hCurl = sCurlPool->GetHandler(only_pool))){
            if(!only_pool){
                S3FS_PRN_ERR("Failed to create handle.");
                return false;
            }else{
                // [NOTE]
                // Further initialization is left to lazy processing that runs later.
                // (Currently only_pool=true is not used, but this code is kept for future use.)
                return true;
            }
        }
    }

    // Report a failed reset instead of handing back a half-configured handle.
    if(!ResetHandle(/*lock_already_held=*/ true)){
        S3FS_PRN_ERR("Failed to reset handle.");
        return false;
    }

    return true;
}

bool S3fsCurl::DestroyCurlHandle(bool restore_pool, bool clear_internal_data)
{
    // [NOTE]
    // If type is REQTYPE_IAMCRED or REQTYPE_IAMROLE, do not clear type.
    // Those types only use the HTTP protocol, and ResetHandle() has
    // special logic that depends on them.
    //
    if(type != REQTYPE_IAMCRED && type != REQTYPE_IAMROLE){
        type = REQTYPE_UNSET;
    }

    if(clear_internal_data){
        ClearInternalData();
    }

    if(hCurl){
        AutoLock lock(&S3fsCurl::curl_handles_lock);

        S3fsCurl::curl_times.erase(hCurl);
        S3fsCurl::curl_progress.erase(hCurl);
        sCurlPool->ReturnHandler(hCurl, restore_pool);
        hCurl = NULL;
    }else{
        return false;
    }
    return true;
}

bool S3fsCurl::ClearInternalData()
{
    // Always clear internal data
    //
    type         = REQTYPE_UNSET;
    path         = "";
    base_path    = "";
    saved_path   = "";
    url          = "";
    op           = "";
    query_string = "";
    if(requestHeaders){
        curl_slist_free_all(requestHeaders);
        requestHeaders = NULL;
    }
    responseHeaders.clear();
    bodydata.Clear();
    headdata.Clear();
    LastResponseCode     = S3FSCURL_RESPONSECODE_NOTSET;
    postdata             = NULL;
    postdata_remaining   = 0;
    retry_count          = 0;
    b_infile             = NULL;
    b_postdata           = NULL;
    b_postdata_remaining = 0;
    b_partdata_startpos  = 0;
    b_partdata_size      = 0;
    partdata.clear();

    fpLazySetup          = NULL;

    S3FS_MALLOCTRIM(0);

    return true;
}

Fixed Issue 229 and changed code
1) Set metadata "Content-Encoding" automatically (Issue 292)
For this issue, s3fs gains a new option, "ahbe_conf".
The option takes the path of a configuration file that specifies
additional HTTP headers by file (object) extension.
Thus you can specify any HTTP header for each object by extension.
* ahbe_conf file format:
-----------
line = [file suffix] HTTP-header [HTTP-header-values]
file suffix = file (object) suffix; if this field is empty,
it means "*" (all objects).
HTTP-header = additional HTTP header name
HTTP-header-values = additional HTTP header value
-----------
* Example:
-----------
.gz Content-Encoding gzip
.Z Content-Encoding compress
X-S3FS-MYHTTPHEAD myvalue
-----------
A sample configuration file is included in the "test" directory.
If the ahbe_conf parameter is specified, s3fs loads the configuration
and compares the extension (suffix) of each object (file) when uploading
(PUT/POST) it. If the extension matches, s3fs sends the specified
HTTP header and value.
With the sample configuration file, if an object with the ".gz"
extension that already has a Content-Encoding HTTP header is renamed
to a ".txt" extension, s3fs does not set Content-Encoding, because
".txt" does not match any line in the configuration file.
That is, s3fs matches the extension on every PUT/POST action.
* Please take care with "Content-Encoding".
This new option allows setting ANY HTTP header by object extension.
For example, you can specify "Content-Encoding" for the ".gz" (etc.)
extension in the configuration. But this means that S3 always returns
"Content-Encoding: gzip", even when a client sends a different
"Accept-Encoding:" header. This is not desirable.
Please see RFC 2616.
2) Changes to the allow_other/uid/gid options for the mount point
I reviewed the mount point permissions and the allow_other/uid/gid
options, and found bugs in them.
The bugs are fixed and s3fs now follows these specifications:
* s3fs allows the uid (gid) option to be 0 (root) only when the
effective user is 0 (root).
* The mount point (directory) must have permissions that allow
access by the effective user/group.
* If the allow_other option is specified, the mount point permission
is set to 0777 (all users have full access).
Otherwise, the mount point is set to 0700 (only the effective
user has access).
* When the uid/gid option is specified, the mount point owner/group
is set to the uid/gid option value.
If uid/gid is not set, the effective user/group ID is used.
These changes may fix some issues (321, 338).
3) Changed the logic for Issue 229
The chmod command returns -EIO when changing the mount point.
That is correct, since s3fs cannot change the owner/group/mtime of the
mount point, but s3fs still sent a request to change the bucket.
This revision does not send the request and returns EIO as
soon as possible.
git-svn-id: http://s3fs.googlecode.com/svn/trunk@465 df820570-a93a-0410-bd06-b72b767a4274
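
// The suffix matching described for ahbe_conf can be sketched roughly as
// follows. This is an illustration under stated assumptions, not the s3fs
// code: AhbeRule, header_list_t, EndsWith and AdditionalHeaders are
// hypothetical names, matching is assumed to be case-sensitive, and every
// matching line is assumed to contribute its header.

```cpp
#include <string>
#include <utility>
#include <vector>

// One parsed configuration line (hypothetical type).
struct AhbeRule
{
    std::string suffix;   // empty means "*" (all objects)
    std::string header;   // additional HTTP header name
    std::string value;    // additional HTTP header value
};

typedef std::vector<std::pair<std::string, std::string> > header_list_t;

// True when path ends with suffix.
static bool EndsWith(const std::string& path, const std::string& suffix)
{
    return suffix.size() <= path.size() &&
           0 == path.compare(path.size() - suffix.size(), suffix.size(), suffix);
}

// Collects the headers of every rule whose suffix matches the object path.
header_list_t AdditionalHeaders(const std::vector<AhbeRule>& rules, const std::string& path)
{
    header_list_t result;
    for(const AhbeRule& rule : rules){
        if(rule.suffix.empty() || EndsWith(path, rule.suffix)){
            result.push_back(std::make_pair(rule.header, rule.value));
        }
    }
    return result;
}
```

// With the three example lines above, "/dir/file.gz" would receive both
// Content-Encoding and X-S3FS-MYHTTPHEAD, while "/dir/file.txt" would receive
// only X-S3FS-MYHTTPHEAD, because the lookup is repeated on each upload.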

bool S3fsCurl::SetUseAhbe(bool ahbe)
{
    bool old    = is_use_ahbe;
    is_use_ahbe = ahbe;
    return old;
}

bool S3fsCurl::GetResponseCode(long& responseCode, bool from_curl_handle)
{
    responseCode = -1;

    if(!from_curl_handle){
        responseCode = LastResponseCode;
    }else{
        if(!hCurl){
            return false;
        }
        if(CURLE_OK != curl_easy_getinfo(hCurl, CURLINFO_RESPONSE_CODE, &LastResponseCode)){
            return false;
        }
        responseCode = LastResponseCode;
    }
    return true;
}

//
// Reset all options for retrying
//
bool S3fsCurl::RemakeHandle()
{
    S3FS_PRN_INFO3("Retry request. [type=%d][url=%s][path=%s]", type, url.c_str(), path.c_str());

    if(REQTYPE_UNSET == type){
        return false;
    }

    // rewind file
    struct stat st;
    if(b_infile){
        rewind(b_infile);
        if(-1 == fstat(fileno(b_infile), &st)){
            S3FS_PRN_WARN("Could not get file stat(fd=%d)", fileno(b_infile));
            return false;
        }
    }

    // reinitialize internal data
    requestHeaders = curl_slist_remove(requestHeaders, "Authorization");
    responseHeaders.clear();
    bodydata.Clear();
    headdata.Clear();
    LastResponseCode = S3FSCURL_RESPONSECODE_NOTSET;

    // count up(only use for multipart)
    retry_count++;

    // set from backup
    postdata           = b_postdata;
    postdata_remaining = b_postdata_remaining;
    partdata.startpos  = b_partdata_startpos;
    partdata.size      = b_partdata_size;

    // reset handle, and give up the retry if the common options cannot be set
    if(!ResetHandle()){
        return false;
    }

    // set options
    switch(type){
        case REQTYPE_DELETE:
            if(CURLE_OK != curl_easy_setopt(hCurl, CURLOPT_URL, url.c_str())){
                return false;
            }
            if(CURLE_OK != curl_easy_setopt(hCurl, CURLOPT_CUSTOMREQUEST, "DELETE")){
                return false;
            }
            break;

        case REQTYPE_HEAD:
            if(CURLE_OK != curl_easy_setopt(hCurl, CURLOPT_URL, url.c_str())){
                return false;
            }
            if(CURLE_OK != curl_easy_setopt(hCurl, CURLOPT_NOBODY, 1L)){
                return false;
            }
            if(CURLE_OK != curl_easy_setopt(hCurl, CURLOPT_FILETIME, 1L)){
                return false;
            }
            // responseHeaders
            if(CURLE_OK != curl_easy_setopt(hCurl, CURLOPT_HEADERDATA, (void*)&responseHeaders)){
                return false;
            }
            if(CURLE_OK != curl_easy_setopt(hCurl, CURLOPT_HEADERFUNCTION, HeaderCallback)){
                return false;
            }
            break;

        case REQTYPE_PUTHEAD:
            if(CURLE_OK != curl_easy_setopt(hCurl, CURLOPT_URL, url.c_str())){
                return false;
            }
            if(CURLE_OK != curl_easy_setopt(hCurl, CURLOPT_UPLOAD, 1L)){
                return false;
            }
            if(CURLE_OK != curl_easy_setopt(hCurl, CURLOPT_WRITEDATA, (void*)&bodydata)){
                return false;
            }
            if(CURLE_OK != curl_easy_setopt(hCurl, CURLOPT_WRITEFUNCTION, WriteMemoryCallback)){
                return false;
            }
            if(CURLE_OK != curl_easy_setopt(hCurl, CURLOPT_INFILESIZE, 0L)){
                return false;
            }
            break;

        case REQTYPE_PUT:
            if(CURLE_OK != curl_easy_setopt(hCurl, CURLOPT_URL, url.c_str())){
                return false;
            }
            if(CURLE_OK != curl_easy_setopt(hCurl, CURLOPT_UPLOAD, 1L)){
                return false;
            }
            if(CURLE_OK != curl_easy_setopt(hCurl, CURLOPT_WRITEDATA, (void*)&bodydata)){
                return false;
            }
            if(CURLE_OK != curl_easy_setopt(hCurl, CURLOPT_WRITEFUNCTION, WriteMemoryCallback)){
                return false;
            }
            if(b_infile){
                // st was filled by the fstat() call made when the file was rewound
                if(CURLE_OK != curl_easy_setopt(hCurl, CURLOPT_INFILESIZE_LARGE, static_cast<curl_off_t>(st.st_size))){
                    return false;
                }
                if(CURLE_OK != curl_easy_setopt(hCurl, CURLOPT_INFILE, b_infile)){
                    return false;
                }
            }else{
                if(CURLE_OK != curl_easy_setopt(hCurl, CURLOPT_INFILESIZE, 0L)){
                    return false;
                }
            }
            break;

        case REQTYPE_GET:
            if(CURLE_OK != curl_easy_setopt(hCurl, CURLOPT_URL, url.c_str())){
                return false;
            }
            if(CURLE_OK != curl_easy_setopt(hCurl, CURLOPT_WRITEFUNCTION, S3fsCurl::DownloadWriteCallback)){
                return false;
            }
            if(CURLE_OK != curl_easy_setopt(hCurl, CURLOPT_WRITEDATA, (void*)this)){
                return false;
            }
            break;

        case REQTYPE_CHKBUCKET:
            if(CURLE_OK != curl_easy_setopt(hCurl, CURLOPT_URL, url.c_str())){
                return false;
            }
            if(CURLE_OK != curl_easy_setopt(hCurl, CURLOPT_WRITEDATA, (void*)&bodydata)){
                return false;
            }
            if(CURLE_OK != curl_easy_setopt(hCurl, CURLOPT_WRITEFUNCTION, WriteMemoryCallback)){
                return false;
            }
            break;

        case REQTYPE_LISTBUCKET:
            if(CURLE_OK != curl_easy_setopt(hCurl, CURLOPT_URL, url.c_str())){
                return false;
            }
            if(CURLE_OK != curl_easy_setopt(hCurl, CURLOPT_WRITEDATA, (void*)&bodydata)){
                return false;
            }
            if(CURLE_OK != curl_easy_setopt(hCurl, CURLOPT_WRITEFUNCTION, WriteMemoryCallback)){
                return false;
            }
            break;

        case REQTYPE_PREMULTIPOST:
            if(CURLE_OK != curl_easy_setopt(hCurl, CURLOPT_URL, url.c_str())){
                return false;
            }
            if(CURLE_OK != curl_easy_setopt(hCurl, CURLOPT_POST, 1L)){
                return false;
            }
            if(CURLE_OK != curl_easy_setopt(hCurl, CURLOPT_WRITEDATA, (void*)&bodydata)){
                return false;
            }
            if(CURLE_OK != curl_easy_setopt(hCurl, CURLOPT_WRITEFUNCTION, WriteMemoryCallback)){
                return false;
            }
            if(CURLE_OK != curl_easy_setopt(hCurl, CURLOPT_POSTFIELDSIZE, 0L)){
                return false;
            }
            break;

        case REQTYPE_COMPLETEMULTIPOST:
            if(CURLE_OK != curl_easy_setopt(hCurl, CURLOPT_URL, url.c_str())){
                return false;
            }
            if(CURLE_OK != curl_easy_setopt(hCurl, CURLOPT_POST, 1L)){
                return false;
            }
            if(CURLE_OK != curl_easy_setopt(hCurl, CURLOPT_WRITEDATA, (void*)&bodydata)){
                return false;
            }
            if(CURLE_OK != curl_easy_setopt(hCurl, CURLOPT_WRITEFUNCTION, WriteMemoryCallback)){
                return false;
            }
            // The _LARGE variant is the one that takes a curl_off_t argument.
            if(CURLE_OK != curl_easy_setopt(hCurl, CURLOPT_POSTFIELDSIZE_LARGE, static_cast<curl_off_t>(postdata_remaining))){
                return false;
            }
            if(CURLE_OK != curl_easy_setopt(hCurl, CURLOPT_READDATA, (void*)this)){
                return false;
            }
            if(CURLE_OK != curl_easy_setopt(hCurl, CURLOPT_READFUNCTION, S3fsCurl::ReadCallback)){
                return false;
            }
            break;

        case REQTYPE_UPLOADMULTIPOST:
            if(CURLE_OK != curl_easy_setopt(hCurl, CURLOPT_URL, url.c_str())){
                return false;
            }
            if(CURLE_OK != curl_easy_setopt(hCurl, CURLOPT_UPLOAD, 1L)){
                return false;
            }
            if(CURLE_OK != curl_easy_setopt(hCurl, CURLOPT_WRITEDATA, (void*)&bodydata)){
                return false;
            }
            if(CURLE_OK != curl_easy_setopt(hCurl, CURLOPT_WRITEFUNCTION, WriteMemoryCallback)){
                return false;
            }
            if(CURLE_OK != curl_easy_setopt(hCurl, CURLOPT_HEADERDATA, (void*)&responseHeaders)){
                return false;
            }
            if(CURLE_OK != curl_easy_setopt(hCurl, CURLOPT_HEADERFUNCTION, HeaderCallback)){
                return false;
            }
            if(CURLE_OK != curl_easy_setopt(hCurl, CURLOPT_INFILESIZE_LARGE, static_cast<curl_off_t>(partdata.size))){
                return false;
            }
            if(CURLE_OK != curl_easy_setopt(hCurl, CURLOPT_READFUNCTION, S3fsCurl::UploadReadCallback)){
                return false;
            }
            if(CURLE_OK != curl_easy_setopt(hCurl, CURLOPT_READDATA, (void*)this)){
                return false;
            }
            break;

        case REQTYPE_COPYMULTIPOST:
            if(CURLE_OK != curl_easy_setopt(hCurl, CURLOPT_URL, url.c_str())){
                return false;
            }
            if(CURLE_OK != curl_easy_setopt(hCurl, CURLOPT_UPLOAD, 1L)){
                return false;
            }
            if(CURLE_OK != curl_easy_setopt(hCurl, CURLOPT_WRITEDATA, (void*)&bodydata)){
                return false;
            }
            if(CURLE_OK != curl_easy_setopt(hCurl, CURLOPT_WRITEFUNCTION, WriteMemoryCallback)){
                return false;
            }
            if(CURLE_OK != curl_easy_setopt(hCurl, CURLOPT_HEADERDATA, (void*)&headdata)){
                return false;
            }
            if(CURLE_OK != curl_easy_setopt(hCurl, CURLOPT_HEADERFUNCTION, WriteMemoryCallback)){
                return false;
            }
            if(CURLE_OK != curl_easy_setopt(hCurl, CURLOPT_INFILESIZE, 0L)){
                return false;
            }
            break;

        case REQTYPE_MULTILIST:
            if(CURLE_OK != curl_easy_setopt(hCurl, CURLOPT_URL, url.c_str())){
                return false;
            }
            if(CURLE_OK != curl_easy_setopt(hCurl, CURLOPT_WRITEDATA, (void*)&bodydata)){
                return false;
            }
            if(CURLE_OK != curl_easy_setopt(hCurl, CURLOPT_WRITEFUNCTION, WriteMemoryCallback)){
                return false;
            }
            break;

        case REQTYPE_IAMCRED:
            if(CURLE_OK != curl_easy_setopt(hCurl, CURLOPT_URL, url.c_str())){
                return false;
            }
            if(CURLE_OK != curl_easy_setopt(hCurl, CURLOPT_WRITEDATA, (void*)&bodydata)){
                return false;
            }
            if(CURLE_OK != curl_easy_setopt(hCurl, CURLOPT_WRITEFUNCTION, WriteMemoryCallback)){
                return false;
            }
            if(S3fsCurl::is_ibm_iam_auth){
                if(CURLE_OK != curl_easy_setopt(hCurl, CURLOPT_POST, 1L)){
                    return false;
                }
                // The _LARGE variant is the one that takes a curl_off_t argument.
                if(CURLE_OK != curl_easy_setopt(hCurl, CURLOPT_POSTFIELDSIZE_LARGE, static_cast<curl_off_t>(postdata_remaining))){
                    return false;
                }
                if(CURLE_OK != curl_easy_setopt(hCurl, CURLOPT_READDATA, (void*)this)){
                    return false;
                }
                if(CURLE_OK != curl_easy_setopt(hCurl, CURLOPT_READFUNCTION, S3fsCurl::ReadCallback)){
                    return false;
                }
            }
            break;

        case REQTYPE_ABORTMULTIUPLOAD:
            if(CURLE_OK != curl_easy_setopt(hCurl, CURLOPT_URL, url.c_str())){
                return false;
            }
            if(CURLE_OK != curl_easy_setopt(hCurl, CURLOPT_CUSTOMREQUEST, "DELETE")){
                return false;
            }
            break;

        case REQTYPE_IAMROLE:
            if(CURLE_OK != curl_easy_setopt(hCurl, CURLOPT_URL, url.c_str())){
                return false;
            }
            if(CURLE_OK != curl_easy_setopt(hCurl, CURLOPT_WRITEDATA, (void*)&bodydata)){
                return false;
            }
            if(CURLE_OK != curl_easy_setopt(hCurl, CURLOPT_WRITEFUNCTION, WriteMemoryCallback)){
                return false;
            }
            break;

        default:
            S3FS_PRN_ERR("request type is unknown(%d)", type);
            return false;
    }
    if(!S3fsCurl::AddUserAgent(hCurl)){                            // put User-Agent
        return false;
    }

    return true;
}

//
// Returns 0 on success, or a negative errno value on failure
//
int S3fsCurl::RequestPerform(bool dontAddAuthHeaders /*=false*/)
{
    if(S3fsLog::IsS3fsLogDbg()){
        char* ptr_url = NULL;
        curl_easy_getinfo(hCurl, CURLINFO_EFFECTIVE_URL, &ptr_url);
        S3FS_PRN_DBG("connecting to URL %s", SAFESTRPTR(ptr_url));
    }

    LastResponseCode  = S3FSCURL_RESPONSECODE_NOTSET;
    long responseCode = S3FSCURL_RESPONSECODE_NOTSET;
    int result        = S3FSCURL_PERFORM_RESULT_NOTSET;

    // 1 attempt + retries...
    for(int retrycnt = 0; S3FSCURL_PERFORM_RESULT_NOTSET == result && retrycnt < S3fsCurl::retries; ++retrycnt){
        // Reset response code
        responseCode = S3FSCURL_RESPONSECODE_NOTSET;

        // Insert headers
        if(!dontAddAuthHeaders){
            insertAuthHeaders();
        }

        if(CURLE_OK != curl_easy_setopt(hCurl, CURLOPT_HTTPHEADER, requestHeaders)){
            // This function returns an int result, where false (0) would mean success.
            return -EIO;
        }

        // Requests
        curlCode = curl_easy_perform(hCurl);

        // Check result
        switch(curlCode){
            case CURLE_OK:
                // Need to look at the HTTP response code
                if(0 != curl_easy_getinfo(hCurl, CURLINFO_RESPONSE_CODE, &responseCode)){
                    S3FS_PRN_ERR("curl_easy_getinfo failed while trying to retrieve HTTP response code");
                    responseCode = S3FSCURL_RESPONSECODE_FATAL_ERROR;
                    result       = -EIO;
                    break;
                }
                if(responseCode >= 200 && responseCode < 300){
                    S3FS_PRN_INFO3("HTTP response code %ld", responseCode);
                    result = 0;
                    break;
                }

                {
                    // Try to parse more specific AWS error code otherwise fall back to HTTP error code.
                    std::string value;
                    if(simple_parse_xml(bodydata.str(), bodydata.size(), "Code", value)){
                        // TODO: other error codes
                        if(value == "EntityTooLarge"){
                            result = -EFBIG;
                            break;
                        }else if(value == "InvalidObjectState"){
                            result = -EREMOTE;
                            break;
                        }else if(value == "KeyTooLongError"){
                            result = -ENAMETOOLONG;
                            break;
                        }
                    }
                }

                // Service response codes which are >= 300
                switch(responseCode){
                    case 301:
                    case 307:
                        S3FS_PRN_ERR("HTTP response code %ld(redirect: also happens when bucket's region is incorrect), returning EIO. Body Text: %s", responseCode, bodydata.str());
                        S3FS_PRN_ERR("The url and endpoint options may help to solve this; please try specifying both.");
                        result = -EIO;
                        break;
                    case 400:
                        if(op == "HEAD"){
                            if(path.size() > 1024){
                                S3FS_PRN_ERR("HEAD HTTP response code %ld with path longer than 1024, returning ENAMETOOLONG.", responseCode);
                                return -ENAMETOOLONG;
                            }
                            S3FS_PRN_ERR("HEAD HTTP response code %ld, returning EPERM.", responseCode);
                            result = -EPERM;
                        }else{
                            S3FS_PRN_ERR("HTTP response code %ld, returning EIO. Body Text: %s", responseCode, bodydata.str());
                            result = -EIO;
                        }
                        break;
                    case 403:
                        S3FS_PRN_ERR("HTTP response code %ld, returning EPERM. Body Text: %s", responseCode, bodydata.str());
                        result = -EPERM;
                        break;
                    case 404:
                        S3FS_PRN_INFO3("HTTP response code 404 was returned, returning ENOENT");
                        S3FS_PRN_DBG("Body Text: %s", bodydata.str());
                        result = -ENOENT;
                        break;
                    case 416:
                        S3FS_PRN_INFO3("HTTP response code 416 was returned, returning EIO");
                        result = -EIO;
                        break;
                    case 501:
                        S3FS_PRN_INFO3("HTTP response code 501 was returned, returning ENOTSUP");
                        S3FS_PRN_DBG("Body Text: %s", bodydata.str());
                        result = -ENOTSUP;
                        break;
                    case 500:
                    case 503: {
                        // result stays unset here, so the request is retried after the sleep
                        S3FS_PRN_INFO3("HTTP response code %ld was returned, slowing down", responseCode);
                        S3FS_PRN_DBG("Body Text: %s", bodydata.str());
                        // Add jitter to avoid thundering herd.
                        unsigned int sleep_time = 2 << retry_count;
                        sleep(sleep_time + static_cast<unsigned int>(random()) % sleep_time);
                        break;
                    }
                    default:
                        S3FS_PRN_ERR("HTTP response code %ld, returning EIO. Body Text: %s", responseCode, bodydata.str());
                        result = -EIO;
                        break;
                }
                break;

            case CURLE_WRITE_ERROR:
                S3FS_PRN_ERR("### CURLE_WRITE_ERROR");
                sleep(2);
                break;

            case CURLE_OPERATION_TIMEDOUT:
                S3FS_PRN_ERR("### CURLE_OPERATION_TIMEDOUT");
                sleep(2);
                break;

            case CURLE_COULDNT_RESOLVE_HOST:
                S3FS_PRN_ERR("### CURLE_COULDNT_RESOLVE_HOST");
                sleep(2);
                break;

            case CURLE_COULDNT_CONNECT:
                S3FS_PRN_ERR("### CURLE_COULDNT_CONNECT");
                sleep(4);
                break;

            case CURLE_GOT_NOTHING:
                S3FS_PRN_ERR("### CURLE_GOT_NOTHING");
                sleep(4);
                break;

            case CURLE_ABORTED_BY_CALLBACK:
                S3FS_PRN_ERR("### CURLE_ABORTED_BY_CALLBACK");
                sleep(4);
                {
                    AutoLock lock(&S3fsCurl::curl_handles_lock);
                    S3fsCurl::curl_times[hCurl] = time(0);
                }
                break;

            case CURLE_PARTIAL_FILE:
                S3FS_PRN_ERR("### CURLE_PARTIAL_FILE");
                sleep(4);
                break;

            case CURLE_SEND_ERROR:
                S3FS_PRN_ERR("### CURLE_SEND_ERROR");
                sleep(2);
                break;

            case CURLE_RECV_ERROR:
                S3FS_PRN_ERR("### CURLE_RECV_ERROR");
                sleep(2);
                break;

            case CURLE_SSL_CONNECT_ERROR:
                S3FS_PRN_ERR("### CURLE_SSL_CONNECT_ERROR");
                sleep(2);
                break;

            case CURLE_SSL_CACERT:
                S3FS_PRN_ERR("### CURLE_SSL_CACERT");

                // try to locate cert, if successful, then set the
                // option and continue
                if(S3fsCurl::curl_ca_bundle.empty()){
                    if(!S3fsCurl::LocateBundle()){
                        S3FS_PRN_ERR("could not get CURL_CA_BUNDLE.");
                        result = -EIO;
                    }
                    // retry with CAINFO
                }else{
                    S3FS_PRN_ERR("curlCode: %d  msg: %s", curlCode, curl_easy_strerror(curlCode));
                    result = -EIO;
                }
                break;

#ifdef CURLE_PEER_FAILED_VERIFICATION
            case CURLE_PEER_FAILED_VERIFICATION:
                S3FS_PRN_ERR("### CURLE_PEER_FAILED_VERIFICATION");

                first_pos = bucket.find_first_of('.');
                if(first_pos != std::string::npos){
                    S3FS_PRN_INFO("curl returned a CURLE_PEER_FAILED_VERIFICATION error");
                    S3FS_PRN_INFO("security issue found: buckets with periods in their name are incompatible with https");
                    S3FS_PRN_INFO("This check can be overridden by using the -o ssl_verify_hostname=0");
                    S3FS_PRN_INFO("The certificate will still be checked but the hostname will not be verified.");
                    S3FS_PRN_INFO("A more secure method would be to use a bucket name without periods.");
                }else{
                    S3FS_PRN_INFO("my_curl_easy_perform: curlCode: %d -- %s", curlCode, curl_easy_strerror(curlCode));
                }
                result = -EIO;
                break;
#endif

            // This should be invalid since curl option HTTP FAILONERROR is now off
            case CURLE_HTTP_RETURNED_ERROR:
                S3FS_PRN_ERR("### CURLE_HTTP_RETURNED_ERROR");

                if(0 != curl_easy_getinfo(hCurl, CURLINFO_RESPONSE_CODE, &responseCode)){
                    result = -EIO;
                }else{
                    S3FS_PRN_INFO3("HTTP response code =%ld", responseCode);

                    // Let's try to retrieve the
                    if(404 == responseCode){
                        result = -ENOENT;
                    }else if(500 > responseCode){
                        result = -EIO;
                    }
                }
                break;

            // Unknown CURL return code
            default:
                S3FS_PRN_ERR("###curlCode: %d  msg: %s", curlCode, curl_easy_strerror(curlCode));
                result = -EIO;
                break;
        } // switch

        if(S3FSCURL_PERFORM_RESULT_NOTSET == result){
            S3FS_PRN_INFO("### retrying...");

            if(!RemakeHandle()){
                S3FS_PRN_INFO("Failed to reset handle and internal data for retrying.");
                result = -EIO;
                break;
            }
        }
    } // for

    // set last response code
    if(S3FSCURL_RESPONSECODE_NOTSET == responseCode){
        LastResponseCode = S3FSCURL_RESPONSECODE_FATAL_ERROR;
    }else{
        LastResponseCode = responseCode;
    }

    if(S3FSCURL_PERFORM_RESULT_NOTSET == result){
        S3FS_PRN_ERR("### giving up");
        result = -EIO;
    }
    return result;
}
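
// The way RequestPerform() turns an HTTP status into a result, and how long it
// sleeps before retrying a 500/503, can be sketched on its own as below. This
// is a simplified illustration, not the s3fs code: StatusToResult, kRetry and
// BackoffSeconds are hypothetical names, the op-dependent handling of 400 and
// the AWS error-code parsing are left out, and the random value is passed in
// so that the arithmetic is reproducible.

```cpp
#include <cerrno>

// Hypothetical sentinel meaning "no final result yet, retry the request".
static const int kRetry = 1;

// Maps an HTTP status to 0 (success), a negative errno, or kRetry.
int StatusToResult(long status)
{
    if(200 <= status && status < 300){
        return 0;
    }
    switch(status){
        case 403:
            return -EPERM;
        case 404:
            return -ENOENT;
        case 501:
            return -ENOTSUP;
        case 500:
        case 503:
            return kRetry;
        default:
            return -EIO;
    }
}

// Exponential backoff with jitter: the base doubles with every retry, and the
// jitter adds between 0 and base-1 seconds on top of it.
unsigned int BackoffSeconds(unsigned int retry_count, unsigned int random_value)
{
    unsigned int base = 2u << retry_count;
    return base + random_value % base;
}
```

// With retry_count 0 the sleep falls in the range 2..3 seconds, with 1 in 4..7,
// with 2 in 8..15, so concurrent clients are unlikely to retry in lockstep.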

//
// Returns the Amazon AWS signature for the given parameters.
//
// @param method e.g., "GET"
// @param strMD5 value of the Content-MD5 request header
// @param content_type e.g., "application/x-directory"
// @param date e.g., get_date_rfc850()
// @param resource e.g., "/pub"
//
std::string S3fsCurl::CalcSignatureV2(const std::string& method, const std::string& strMD5, const std::string& content_type, const std::string& date, const std::string& resource)
{
    std::string Signature;
    std::string StringToSign;

    if(!S3fsCurl::IAM_role.empty() || S3fsCurl::is_ecs || S3fsCurl::is_use_session_token){
        requestHeaders = curl_slist_sort_insert(requestHeaders, "x-amz-security-token", S3fsCurl::AWSAccessToken.c_str());
    }

    StringToSign += method + "\n";
    StringToSign += strMD5 + "\n";        // md5
    StringToSign += content_type + "\n";
    StringToSign += date + "\n";
    StringToSign += get_canonical_headers(requestHeaders, true);
    StringToSign += resource;

    const void* key            = S3fsCurl::AWSSecretAccessKey.data();
    size_t key_len             = S3fsCurl::AWSSecretAccessKey.size();
    const unsigned char* sdata = reinterpret_cast<const unsigned char*>(StringToSign.data());
    size_t sdata_len           = StringToSign.size();
    unsigned char* md          = NULL;
    unsigned int md_len        = 0;

    s3fs_HMAC(key, key_len, sdata, sdata_len, &md, &md_len);

    char* base64;
    if(NULL == (base64 = s3fs_base64(md, md_len))){
        delete[] md;
        return std::string("");  // ENOMEM
    }
    delete[] md;

    Signature = base64;
    delete[] base64;

    return Signature;
}
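
//
// Sketch only, not part of s3fs: the Signature Version 2 string-to-sign that
// CalcSignatureV2() assembles, isolated as a pure string function.
// build_string_to_sign_v2() is a hypothetical name, and its
// canonical_amz_headers argument stands in for the result of
// get_canonical_headers(requestHeaders, true), which is assumed here to
// already terminate each header line with "\n".
//
```cpp
#include <string>

// Hypothetical helper: joins the fields in the same order as CalcSignatureV2().
std::string build_string_to_sign_v2(const std::string& method,
                                    const std::string& content_md5,
                                    const std::string& content_type,
                                    const std::string& date,
                                    const std::string& canonical_amz_headers,
                                    const std::string& resource)
{
    std::string s;
    s += method + "\n";
    s += content_md5 + "\n";       // empty when no Content-MD5 header is sent
    s += content_type + "\n";      // empty when no Content-Type header is sent
    s += date + "\n";
    s += canonical_amz_headers;    // assumed to carry its own trailing "\n"
    s += resource;                 // no trailing newline after the resource
    return s;
}
```
//
// The HMAC and base64 steps are left out on purpose; they depend on the
// crypto backend that s3fs_HMAC() and s3fs_base64() wrap.
//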

std::string S3fsCurl::CalcSignature(const std::string& method, const std::string& canonical_uri, const std::string& query_string, const std::string& strdate, const std::string& payload_hash, const std::string& date8601)
{
    std::string Signature, StringCQ, StringToSign;
    std::string uriencode;

    if(!S3fsCurl::IAM_role.empty() || S3fsCurl::is_ecs || S3fsCurl::is_use_session_token){
        requestHeaders = curl_slist_sort_insert(requestHeaders, "x-amz-security-token", S3fsCurl::AWSAccessToken.c_str());
    }

    uriencode = urlEncode(canonical_uri);
    StringCQ  = method + "\n";
    if(method == "HEAD" || method == "PUT" || method == "DELETE"){
        StringCQ += uriencode + "\n";
    }else if(method == "GET" && uriencode.empty()){
        StringCQ += "/\n";
    }else if(method == "GET" && is_prefix(uriencode.c_str(), "/")){
        StringCQ += uriencode + "\n";
    }else if(method == "GET" && !is_prefix(uriencode.c_str(), "/")){
        StringCQ += "/\n" + urlEncode2(canonical_uri) + "\n";
    }else if(method == "POST"){
        StringCQ += uriencode + "\n";
    }
    StringCQ += urlEncode2(query_string) + "\n";
    StringCQ += get_canonical_headers(requestHeaders) + "\n";
    StringCQ += get_sorted_header_keys(requestHeaders) + "\n";
    StringCQ += payload_hash;

    std::string   kSecret = "AWS4" + S3fsCurl::AWSSecretAccessKey;
    unsigned char *kDate, *kRegion, *kService, *kSigning, *sRequest = NULL;
    unsigned int  kDate_len, kRegion_len, kService_len, kSigning_len, sRequest_len = 0;

    s3fs_HMAC256(kSecret.c_str(), kSecret.size(), reinterpret_cast<const unsigned char*>(strdate.data()), strdate.size(), &kDate, &kDate_len);
    s3fs_HMAC256(kDate, kDate_len, reinterpret_cast<const unsigned char*>(endpoint.c_str()), endpoint.size(), &kRegion, &kRegion_len);
    s3fs_HMAC256(kRegion, kRegion_len, reinterpret_cast<const unsigned char*>("s3"), sizeof("s3") - 1, &kService, &kService_len);
    s3fs_HMAC256(kService, kService_len, reinterpret_cast<const unsigned char*>("aws4_request"), sizeof("aws4_request") - 1, &kSigning, &kSigning_len);
    delete[] kDate;
    delete[] kRegion;
    delete[] kService;

    const unsigned char* cRequest = reinterpret_cast<const unsigned char*>(StringCQ.c_str());
    size_t cRequest_len           = StringCQ.size();
    s3fs_sha256(cRequest, cRequest_len, &sRequest, &sRequest_len);

    StringToSign  = "AWS4-HMAC-SHA256\n";
    StringToSign += date8601 + "\n";
    StringToSign += strdate + "/" + endpoint + "/s3/aws4_request\n";
    StringToSign += s3fs_hex_lower(sRequest, sRequest_len);
    delete[] sRequest;

    const unsigned char* cscope = reinterpret_cast<const unsigned char*>(StringToSign.c_str());
    size_t cscope_len           = StringToSign.size();
    unsigned char* md           = NULL;
    unsigned int md_len         = 0;

    s3fs_HMAC256(kSigning, kSigning_len, cscope, cscope_len, &md, &md_len);
    delete[] kSigning;

    Signature = s3fs_hex_lower(md, md_len);
    delete[] md;

    return Signature;
}
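
//
// Sketch only, not part of s3fs: the credential scope and string-to-sign that
// CalcSignature() builds for Signature Version 4, written as pure string
// functions. credential_scope_v4() and build_string_to_sign_v4() are
// hypothetical names. The region argument plays the role of the `endpoint`
// member above, and canonical_request_sha256_hex stands in for
// s3fs_hex_lower() applied to the SHA-256 of the canonical request.
//
```cpp
#include <string>

// Hypothetical helper: "<yyyymmdd>/<region>/s3/aws4_request".
std::string credential_scope_v4(const std::string& date8, const std::string& region)
{
    return date8 + "/" + region + "/s3/aws4_request";
}

// Hypothetical helper: joins the four string-to-sign lines in the order used
// by CalcSignature(). The hash is the last line and has no trailing newline.
std::string build_string_to_sign_v4(const std::string& date8601,
                                    const std::string& date8,
                                    const std::string& region,
                                    const std::string& canonical_request_sha256_hex)
{
    std::string s = "AWS4-HMAC-SHA256\n";
    s += date8601 + "\n";
    s += credential_scope_v4(date8, region) + "\n";
    s += canonical_request_sha256_hex;
    return s;
}
```
//
// The same scope string appears again in the Credential= field of the
// Authorization header built by insertV4Headers().
//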

void S3fsCurl::insertV4Headers()
{
    std::string server_path = type == REQTYPE_LISTBUCKET ? "/" : path;
    std::string payload_hash;
    switch (type) {
        case REQTYPE_PUT:
            if(GetUnsignedPayload()){
                payload_hash = "UNSIGNED-PAYLOAD";
            }else{
                payload_hash = s3fs_sha256_hex_fd(b_infile == NULL ? -1 : fileno(b_infile), 0, -1);
            }
            break;

        case REQTYPE_COMPLETEMULTIPOST:
        {
            size_t         cRequest_len = strlen(reinterpret_cast<const char*>(b_postdata));
            unsigned char* sRequest     = NULL;
            unsigned int   sRequest_len = 0;
            s3fs_sha256(b_postdata, cRequest_len, &sRequest, &sRequest_len);
            payload_hash = s3fs_hex_lower(sRequest, sRequest_len);
            delete[] sRequest;
            break;
        }

        case REQTYPE_UPLOADMULTIPOST:
            if(GetUnsignedPayload()){
                payload_hash = "UNSIGNED-PAYLOAD";
            }else{
                payload_hash = s3fs_sha256_hex_fd(partdata.fd, partdata.startpos, partdata.size);
            }
            break;

        default:
            break;
    }

    if(b_infile != NULL && payload_hash.empty()){
        S3FS_PRN_ERR("Failed to make SHA256.");
        // TODO: propagate error
    }

    S3FS_PRN_INFO3("computing signature [%s] [%s] [%s] [%s]", op.c_str(), server_path.c_str(), query_string.c_str(), payload_hash.c_str());
    std::string strdate;
    std::string date8601;
    get_date_sigv3(strdate, date8601);

    std::string contentSHA256  = payload_hash.empty() ? EMPTY_PAYLOAD_HASH : payload_hash;
    const std::string realpath = pathrequeststyle ? "/" + bucket + server_path : server_path;

    //string canonical_headers, signed_headers;
    requestHeaders = curl_slist_sort_insert(requestHeaders, "host", get_bucket_host().c_str());
    requestHeaders = curl_slist_sort_insert(requestHeaders, "x-amz-content-sha256", contentSHA256.c_str());
    requestHeaders = curl_slist_sort_insert(requestHeaders, "x-amz-date", date8601.c_str());

    if(S3fsCurl::IsRequesterPays()){
        requestHeaders = curl_slist_sort_insert(requestHeaders, "x-amz-request-payer", "requester");
    }

    if(!S3fsCurl::IsPublicBucket()){
        std::string Signature = CalcSignature(op, realpath, query_string + (type == REQTYPE_PREMULTIPOST || type == REQTYPE_MULTILIST ? "=" : ""), strdate, contentSHA256, date8601);
        std::string auth = "AWS4-HMAC-SHA256 Credential=" + AWSAccessKeyId + "/" + strdate + "/" + endpoint + "/s3/aws4_request, SignedHeaders=" + get_sorted_header_keys(requestHeaders) + ", Signature=" + Signature;
        requestHeaders = curl_slist_sort_insert(requestHeaders, "Authorization", auth.c_str());
    }
}

void S3fsCurl::insertV2Headers()
{
    std::string resource;
    std::string turl;
    std::string server_path = type == REQTYPE_LISTBUCKET ? "/" : path;
    MakeUrlResource(server_path.c_str(), resource, turl);
    if(!query_string.empty() && type != REQTYPE_CHKBUCKET && type != REQTYPE_LISTBUCKET){
        resource += "?" + query_string;
    }

    std::string date = get_date_rfc850();
    requestHeaders = curl_slist_sort_insert(requestHeaders, "Date", date.c_str());
    if(op != "PUT" && op != "POST"){
        requestHeaders = curl_slist_sort_insert(requestHeaders, "Content-Type", NULL);
    }

    if(!S3fsCurl::IsPublicBucket()){
        std::string Signature = CalcSignatureV2(op, get_header_value(requestHeaders, "Content-MD5"), get_header_value(requestHeaders, "Content-Type"), date, resource);
        requestHeaders = curl_slist_sort_insert(requestHeaders, "Authorization", std::string("AWS " + AWSAccessKeyId + ":" + Signature).c_str());
    }
}

void S3fsCurl::insertIBMIAMHeaders()
{
    requestHeaders = curl_slist_sort_insert(requestHeaders, "Authorization", ("Bearer " + S3fsCurl::AWSAccessToken).c_str());

    if(op == "PUT" && path == mount_prefix + "/"){
        // ibm-service-instance-id header is required for bucket creation requests
        requestHeaders = curl_slist_sort_insert(requestHeaders, "ibm-service-instance-id", S3fsCurl::AWSAccessKeyId.c_str());
    }
}

void S3fsCurl::insertAuthHeaders()
{
    if(!S3fsCurl::CheckIAMCredentialUpdate()){
        S3FS_PRN_ERR("An error occurred in checking IAM credential.");
        return; // do not insert auth headers on error
    }

    if(S3fsCurl::is_ibm_iam_auth){
        insertIBMIAMHeaders();
    }else if(S3fsCurl::signature_type == V2_ONLY){
        insertV2Headers();
    }else{
        insertV4Headers();
    }
}

int S3fsCurl::DeleteRequest(const char* tpath)
{
    S3FS_PRN_INFO3("[tpath=%s]", SAFESTRPTR(tpath));

    if(!tpath){
        return -EINVAL;
    }
    if(!CreateCurlHandle()){
        return -EIO;
    }
    std::string resource;
    std::string turl;
    MakeUrlResource(get_realpath(tpath).c_str(), resource, turl);

    url             = prepare_url(turl.c_str());
    path            = get_realpath(tpath);
    requestHeaders  = NULL;
    responseHeaders.clear();

    op   = "DELETE";
    type = REQTYPE_DELETE;

    if(CURLE_OK != curl_easy_setopt(hCurl, CURLOPT_URL, url.c_str())){
        return -EIO;
    }
    if(CURLE_OK != curl_easy_setopt(hCurl, CURLOPT_CUSTOMREQUEST, "DELETE")){
        return -EIO;
    }
    if(!S3fsCurl::AddUserAgent(hCurl)){                            // put User-Agent
        return -EIO;
    }

    return RequestPerform();
}

//
// Get the token that we need to pass along with AWS IMDSv2 API requests
//
int S3fsCurl::GetIAMv2ApiToken()
{
    url = std::string(S3fsCurl::IAMv2_token_url);
    if(!CreateCurlHandle()){
        return -EIO;
    }
    requestHeaders  = NULL;
    responseHeaders.clear();
    bodydata.Clear();

    std::string ttlstr = str(S3fsCurl::IAMv2_token_ttl);
    requestHeaders = curl_slist_sort_insert(requestHeaders, S3fsCurl::IAMv2_token_ttl_hdr.c_str(),
                                            ttlstr.c_str());

    // Curl appends an "Expect: 100-continue" header to the token request,
    // and aws responds with a 417 Expectation Failed. This ensures the
    // Expect header is empty before the request is sent.
    requestHeaders = curl_slist_sort_insert(requestHeaders, "Expect", "");

    if(CURLE_OK != curl_easy_setopt(hCurl, CURLOPT_PUT, true)){
        return -EIO;
    }
    if(CURLE_OK != curl_easy_setopt(hCurl, CURLOPT_URL, url.c_str())){
        return -EIO;
    }
    if(CURLE_OK != curl_easy_setopt(hCurl, CURLOPT_WRITEDATA, (void*)&bodydata)){
        return -EIO;
    }
    if(CURLE_OK != curl_easy_setopt(hCurl, CURLOPT_WRITEFUNCTION, WriteMemoryCallback)){
        return -EIO;
    }
    if(!S3fsCurl::AddUserAgent(hCurl)){                            // put User-Agent
        return -EIO;
    }

    int result = RequestPerform(true);

    if(0 == result && !S3fsCurl::SetIAMv2APIToken(bodydata.str())){
        S3FS_PRN_ERR("Error storing IMDSv2 API token.");
        result = -EIO;
    }
    bodydata.Clear();

    if(CURLE_OK != curl_easy_setopt(hCurl, CURLOPT_PUT, false)){
        return -EIO;
    }

    return result;
}

//
// Get AccessKeyId/SecretAccessKey/AccessToken/Expiration by IAM role,
// and set these values in class variables.
//
int S3fsCurl::GetIAMCredentials()
{
    if(!S3fsCurl::is_ecs && !S3fsCurl::is_ibm_iam_auth){
        S3FS_PRN_INFO3("[IAM role=%s]", S3fsCurl::IAM_role.c_str());

        if(S3fsCurl::IAM_role.empty()){
            S3FS_PRN_ERR("IAM role name is empty.");
            return -EIO;
        }
    }

    // at first set type for handle
    type = REQTYPE_IAMCRED;

    if(!CreateCurlHandle()){
        return -EIO;
    }

    // url
    if(is_ecs){
        const char *env = std::getenv(ECS_IAM_ENV_VAR);
        if(env == NULL){
            S3FS_PRN_ERR("%s is not set.", ECS_IAM_ENV_VAR);
            return -EIO;
        }
        url = std::string(S3fsCurl::IAM_cred_url) + env;
    }else{
        if(S3fsCurl::IAM_api_version > 1){
            int result = GetIAMv2ApiToken();
            if(-ENOENT == result){
                // If we get a 404 back when requesting the token service,
                // then it's highly likely we're running in an environment
                // that doesn't support the AWS IMDSv2 API, so we'll skip
                // the token retrieval in the future.
                SetIMDSVersion(1);
            }else if(result != 0){
                // If we get an unexpected error when retrieving the API
                // token, log it but continue. The requirement to include
                // an API token with the metadata request may or may not
                // be enforced, so we should not abort here.
                S3FS_PRN_ERR("AWS IMDSv2 token retrieval failed: %d", result);
            }
        }
        url = std::string(S3fsCurl::IAM_cred_url) + S3fsCurl::IAM_role;
    }

    requestHeaders  = NULL;
    responseHeaders.clear();
    bodydata.Clear();
    std::string postContent;

    if(S3fsCurl::is_ibm_iam_auth){
        url = std::string(S3fsCurl::IAM_cred_url);

        // make contents
        postContent += "grant_type=urn:ibm:params:oauth:grant-type:apikey";
        postContent += "&response_type=cloud_iam";
        postContent += "&apikey=" + S3fsCurl::AWSSecretAccessKey;

        // set postdata
        postdata             = reinterpret_cast<const unsigned char*>(postContent.c_str());
        b_postdata           = postdata;
        postdata_remaining   = postContent.size(); // without null
        b_postdata_remaining = postdata_remaining;

        requestHeaders = curl_slist_sort_insert(requestHeaders, "Authorization", "Basic Yng6Yng=");

        if(CURLE_OK != curl_easy_setopt(hCurl, CURLOPT_POST, true)){              // POST
            return -EIO;
        }
        if(CURLE_OK != curl_easy_setopt(hCurl, CURLOPT_POSTFIELDSIZE, static_cast<curl_off_t>(postdata_remaining))){
            return -EIO;
        }
        if(CURLE_OK != curl_easy_setopt(hCurl, CURLOPT_READDATA, (void*)this)){
            return -EIO;
        }
        if(CURLE_OK != curl_easy_setopt(hCurl, CURLOPT_READFUNCTION, S3fsCurl::ReadCallback)){
            return -EIO;
        }
    }

    if(S3fsCurl::IAM_api_version > 1){
        requestHeaders = curl_slist_sort_insert(requestHeaders, S3fsCurl::IAMv2_token_hdr.c_str(), S3fsCurl::IAMv2_api_token.c_str());
    }

    if(CURLE_OK != curl_easy_setopt(hCurl, CURLOPT_URL, url.c_str())){
        return -EIO;
    }
    if(CURLE_OK != curl_easy_setopt(hCurl, CURLOPT_WRITEDATA, (void*)&bodydata)){
        return -EIO;
    }
    if(CURLE_OK != curl_easy_setopt(hCurl, CURLOPT_WRITEFUNCTION, WriteMemoryCallback)){
        return -EIO;
    }
    if(!S3fsCurl::AddUserAgent(hCurl)){                            // put User-Agent
        return -EIO;
    }

    int result = RequestPerform(true);

    // analyzing response
    if(0 == result && !S3fsCurl::SetIAMCredentials(bodydata.str())){
        S3FS_PRN_ERR("Something error occurred, could not get IAM credential.");
        result = -EIO;
    }
    bodydata.Clear();

    return result;
}

//
// Get IAM role name automatically.
//
bool S3fsCurl::LoadIAMRoleFromMetaData()
{
    S3FS_PRN_INFO3("Get IAM Role name");

    // at first set type for handle
    type = REQTYPE_IAMROLE;

    if(!CreateCurlHandle()){
        return false;
    }

    // url
    if(is_ecs){
        const char *env = std::getenv(ECS_IAM_ENV_VAR);
        if(env == NULL){
            S3FS_PRN_ERR("%s is not set.", ECS_IAM_ENV_VAR);
            // This function returns bool; -EIO would convert to true (success).
            return false;
        }
        url = std::string(S3fsCurl::IAM_cred_url) + env;
    }else{
        if(S3fsCurl::IAM_api_version > 1){
            int result = GetIAMv2ApiToken();
            if(-ENOENT == result){
                // If we get a 404 back when requesting the token service,
                // then it's highly likely we're running in an environment
                // that doesn't support the AWS IMDSv2 API, so we'll skip
                // the token retrieval in the future.
                SetIMDSVersion(1);
            }else if(result != 0){
                // If we get an unexpected error when retrieving the API
                // token, log it but continue. Requirement for including
                // an API token with the metadata request may or may not
                // be enforced, so we should not abort here.
                S3FS_PRN_ERR("AWS IMDSv2 token retrieval failed: %d", result);
            }
        }
        url = std::string(S3fsCurl::IAM_cred_url);
    }

    requestHeaders  = NULL;
    responseHeaders.clear();
    bodydata.Clear();

    if(S3fsCurl::IAM_api_version > 1){
        requestHeaders = curl_slist_sort_insert(requestHeaders, S3fsCurl::IAMv2_token_hdr.c_str(), S3fsCurl::IAMv2_api_token.c_str());
    }

    if(CURLE_OK != curl_easy_setopt(hCurl, CURLOPT_URL, url.c_str())){
        return false;
    }
    if(CURLE_OK != curl_easy_setopt(hCurl, CURLOPT_WRITEDATA, (void*)&bodydata)){
        return false;
    }
    if(CURLE_OK != curl_easy_setopt(hCurl, CURLOPT_WRITEFUNCTION, WriteMemoryCallback)){
        return false;
    }
    if(!S3fsCurl::AddUserAgent(hCurl)){                            // put User-Agent
        return false;
    }

    int result = RequestPerform(true);

    // analyzing response
    if(0 == result && !S3fsCurl::SetIAMRoleFromMetaData(bodydata.str())){
        S3FS_PRN_ERR("Something error occurred, could not get IAM role name.");
        result = -EIO;
    }
    bodydata.Clear();

    return (0 == result);
}

bool S3fsCurl::AddSseRequestHead(sse_type_t ssetype, const std::string& input, bool is_only_c, bool is_copy)
{
    std::string ssevalue = input;
    switch(ssetype){
        case sse_type_t::SSE_DISABLE:
            return true;
        case sse_type_t::SSE_S3:
            if(!is_only_c){
                requestHeaders = curl_slist_sort_insert(requestHeaders, "x-amz-server-side-encryption", "AES256");
            }
            return true;
        case sse_type_t::SSE_C:
        {
            std::string sseckey;
            if(S3fsCurl::GetSseKey(ssevalue, sseckey)){
                if(is_copy){
                    requestHeaders = curl_slist_sort_insert(requestHeaders, "x-amz-copy-source-server-side-encryption-customer-algorithm", "AES256");
                    requestHeaders = curl_slist_sort_insert(requestHeaders, "x-amz-copy-source-server-side-encryption-customer-key", sseckey.c_str());
                    requestHeaders = curl_slist_sort_insert(requestHeaders, "x-amz-copy-source-server-side-encryption-customer-key-md5", ssevalue.c_str());
                }else{
                    requestHeaders = curl_slist_sort_insert(requestHeaders, "x-amz-server-side-encryption-customer-algorithm", "AES256");
                    requestHeaders = curl_slist_sort_insert(requestHeaders, "x-amz-server-side-encryption-customer-key", sseckey.c_str());
                    requestHeaders = curl_slist_sort_insert(requestHeaders, "x-amz-server-side-encryption-customer-key-md5", ssevalue.c_str());
                }
            }else{
                S3FS_PRN_WARN("Failed to insert SSE-C header.");
            }
            return true;
        }
        case sse_type_t::SSE_KMS:
            if(!is_only_c){
                if(ssevalue.empty()){
                    ssevalue = S3fsCurl::GetSseKmsId();
                }
                requestHeaders = curl_slist_sort_insert(requestHeaders, "x-amz-server-side-encryption", "aws:kms");
                requestHeaders = curl_slist_sort_insert(requestHeaders, "x-amz-server-side-encryption-aws-kms-key-id", ssevalue.c_str());
            }
            return true;
    }
    S3FS_PRN_ERR("sse type is unknown(%d).", static_cast<int>(S3fsCurl::ssetype));

    return false;
}

//
// tpath : target path for head request
// bpath : saved into base_path
// savedpath : saved into saved_path
// ssekey_pos : -1 means "not" SSE-C type
//              0 - X means SSE-C type and position for SSE-C key(0 is latest key)
//
bool S3fsCurl::PreHeadRequest(const char* tpath, const char* bpath, const char* savedpath, size_t ssekey_pos)
{
    S3FS_PRN_INFO3("[tpath=%s][bpath=%s][save=%s][sseckeypos=%zu]", SAFESTRPTR(tpath), SAFESTRPTR(bpath), SAFESTRPTR(savedpath), ssekey_pos);

    if(!tpath){
        return false;
    }
    std::string resource;
    std::string turl;
    MakeUrlResource(get_realpath(tpath).c_str(), resource, turl);

    // libcurl 7.17 does deep copy of url, deep copy "stable" url
    url             = prepare_url(turl.c_str());
    path            = get_realpath(tpath);
    base_path       = SAFESTRPTR(bpath);
    saved_path      = SAFESTRPTR(savedpath);
    requestHeaders  = NULL;
    responseHeaders.clear();

    // requestHeaders
    // ssekey_pos is unsigned, so "-1" (not SSE-C) arrives as static_cast<size_t>(-1);
    // every other value, not only 0, is a position in the SSE-C key list.
    if(static_cast<size_t>(-1) != ssekey_pos){
        std::string md5;
        if(!S3fsCurl::GetSseKeyMd5(ssekey_pos, md5) || !AddSseRequestHead(sse_type_t::SSE_C, md5, true, false)){
            S3FS_PRN_ERR("Failed to set SSE-C headers for sse-c key pos(%zu)(=md5(%s)).", ssekey_pos, md5.c_str());
            return false;
        }
    }
    b_ssekey_pos = ssekey_pos;

    op   = "HEAD";
    type = REQTYPE_HEAD;

    // set lazy function
    fpLazySetup = PreHeadRequestSetCurlOpts;

    return true;
}

int S3fsCurl::HeadRequest(const char* tpath, headers_t& meta)
{
    int result = -1;

    S3FS_PRN_INFO3("[tpath=%s]", SAFESTRPTR(tpath));

    // At first, try to get without SSE-C headers
    if(!PreHeadRequest(tpath) || !fpLazySetup || !fpLazySetup(this) || 0 != (result = RequestPerform())){
        // If has SSE-C keys, try to get with all SSE-C keys.
        for(size_t pos = 0; pos < S3fsCurl::sseckeys.size(); pos++){
            if(!DestroyCurlHandle()){
                break;
            }
            if(!PreHeadRequest(tpath, NULL, NULL, pos)){
                break;
            }
            if(!fpLazySetup || !fpLazySetup(this)){
                S3FS_PRN_ERR("Failed to lazy setup in single head request.");
                break;
            }
            if(0 == (result = RequestPerform())){
                break;
            }
        }
        if(0 != result){
            DestroyCurlHandle();  // not check result.
            return result;
        }
    }

    // file exists in s3
    // fixme: clean this up.
    meta.clear();
    for(headers_t::iterator iter = responseHeaders.begin(); iter != responseHeaders.end(); ++iter){
        std::string key   = lower(iter->first);
        std::string value = iter->second;
        if(key == "content-type"){
            meta[iter->first] = value;
        }else if(key == "content-length"){
            meta[iter->first] = value;
        }else if(key == "etag"){
            meta[iter->first] = value;
        }else if(key == "last-modified"){
            meta[iter->first] = value;
        }else if(is_prefix(key.c_str(), "x-amz")){
            meta[key] = value;        // key is lower case for "x-amz"
        }
    }
    return 0;
}
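
//
// Sketch only, not part of s3fs: the response-header filter at the end of
// HeadRequest(), isolated so it can be exercised without a curl handle.
// header_map_t, to_lower() and filter_head_response() are hypothetical names.
// header_map_t is a plain std::map here, so unlike whatever comparator
// headers_t may use, its keys are case-sensitive.
//
```cpp
#include <cctype>
#include <map>
#include <string>

typedef std::map<std::string, std::string> header_map_t;   // hypothetical stand-in for headers_t

// Hypothetical stand-in for lower(): ASCII lower-casing of a header name.
static std::string to_lower(std::string s)
{
    for(std::string::size_type i = 0; i < s.size(); ++i){
        s[i] = static_cast<char>(std::tolower(static_cast<unsigned char>(s[i])));
    }
    return s;
}

// Keeps the four standard headers under their original spelling and every
// "x-amz" header under a lower-cased key; everything else is dropped.
header_map_t filter_head_response(const header_map_t& response)
{
    header_map_t meta;
    for(header_map_t::const_iterator iter = response.begin(); iter != response.end(); ++iter){
        const std::string key = to_lower(iter->first);
        if(key == "content-type" || key == "content-length" || key == "etag" || key == "last-modified"){
            meta[iter->first] = iter->second;
        }else if(0 == key.compare(0, 5, "x-amz")){
            meta[key] = iter->second;
        }
    }
    return meta;
}
```
//
// Folding the four equality branches into one condition is only a
// presentation choice for the sketch; the selection rule is the same.
//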

int S3fsCurl::PutHeadRequest(const char* tpath, headers_t& meta, bool is_copy)
{
    S3FS_PRN_INFO3("[tpath=%s]", SAFESTRPTR(tpath));

    if(!tpath){
        return -EINVAL;
    }
    if(!CreateCurlHandle()){
        return -EIO;
    }
    std::string resource;
    std::string turl;
    MakeUrlResource(get_realpath(tpath).c_str(), resource, turl);

    url             = prepare_url(turl.c_str());
    path            = get_realpath(tpath);
    requestHeaders  = NULL;
    responseHeaders.clear();
    bodydata.Clear();

    std::string contype = S3fsCurl::LookupMimeType(std::string(tpath));
    requestHeaders      = curl_slist_sort_insert(requestHeaders, "Content-Type", contype.c_str());

    // Make request headers
    for(headers_t::iterator iter = meta.begin(); iter != meta.end(); ++iter){
        std::string key   = lower(iter->first);
        std::string value = iter->second;
        if(is_prefix(key.c_str(), "x-amz-acl")){
            // not set value, but after set it.
        }else if(is_prefix(key.c_str(), "x-amz-meta")){
            requestHeaders = curl_slist_sort_insert(requestHeaders, iter->first.c_str(), value.c_str());
        }else if(key == "x-amz-copy-source"){
            requestHeaders = curl_slist_sort_insert(requestHeaders, iter->first.c_str(), value.c_str());
        }else if(key == "x-amz-server-side-encryption" && value != "aws:kms"){
            // Only copy mode.
            if(is_copy && !AddSseRequestHead(sse_type_t::SSE_S3, value, false, true)){
                S3FS_PRN_WARN("Failed to insert SSE-S3 header.");
            }
        }else if(key == "x-amz-server-side-encryption-aws-kms-key-id"){
            // Only copy mode.
            if(is_copy && !value.empty() && !AddSseRequestHead(sse_type_t::SSE_KMS, value, false, true)){
                S3FS_PRN_WARN("Failed to insert SSE-KMS header.");
            }
        }else if(key == "x-amz-server-side-encryption-customer-key-md5"){
            // Only copy mode.
            if(is_copy){
                if(!AddSseRequestHead(sse_type_t::SSE_C, value, true, true) || !AddSseRequestHead(sse_type_t::SSE_C, value, true, false)){
                    S3FS_PRN_WARN("Failed to insert SSE-C header.");
                }
            }
        }
    }

    // "x-amz-acl", storage class, sse
    if(S3fsCurl::default_acl != acl_t::PRIVATE){
        requestHeaders = curl_slist_sort_insert(requestHeaders, "x-amz-acl", S3fsCurl::default_acl.str());
    }
    if(strcasecmp(GetStorageClass().c_str(), "STANDARD") != 0){
        requestHeaders = curl_slist_sort_insert(requestHeaders, "x-amz-storage-class", GetStorageClass().c_str());
    }
    // SSE
    if(!is_copy){
        std::string ssevalue;
        if(!AddSseRequestHead(S3fsCurl::GetSseType(), ssevalue, false, false)){
            S3FS_PRN_WARN("Failed to set SSE header, but continue...");
        }
    }
    if(is_use_ahbe){
        // set additional header by ahbe conf
        requestHeaders = AdditionalHeader::get()->AddHeader(requestHeaders, tpath);
    }

    op   = "PUT";
    type = REQTYPE_PUTHEAD;

    // setopt
    if(CURLE_OK != curl_easy_setopt(hCurl, CURLOPT_URL, url.c_str())){
        return -EIO;
    }
    if(CURLE_OK != curl_easy_setopt(hCurl, CURLOPT_UPLOAD, true)){                // HTTP PUT
        return -EIO;
    }
    if(CURLE_OK != curl_easy_setopt(hCurl, CURLOPT_WRITEDATA, (void*)&bodydata)){
        return -EIO;
    }
    if(CURLE_OK != curl_easy_setopt(hCurl, CURLOPT_WRITEFUNCTION, WriteMemoryCallback)){
        return -EIO;
    }
    if(CURLE_OK != curl_easy_setopt(hCurl, CURLOPT_INFILESIZE, 0)){               // Content-Length
        return -EIO;
    }
    if(!S3fsCurl::AddUserAgent(hCurl)){                            // put User-Agent
        return -EIO;
    }
Changed code for performance (part 3)
* Summary
This revision makes a big change to how temporary files and local cache
files are handled.
With this change, s3fs performs well when it opens, closes, syncs, and reads
objects.
* Detail
1) About the temporary file (local file)
s3fs uses a temporary file on the local file system when it downloads,
uploads, opens, or seeks within an object on S3.
As of this revision, s3fs calls ftruncate() when it creates the temporary
file. This lets s3fs set the file to the exact object size without
downloading any data.
(Note: ftruncate() is specified for XSI-compliant systems, so you may have
problems on non-XSI-compliant systems.)
With this change, s3fs can download part of an object by sending a request
with the "Range" HTTP header, so that downloads proceed block by block.
The default block (part) size is 50MB, which is the product of the default
parallel request count (5) and the default multipart upload size (10MB).
If you need a different block size, set the new "fd_page_size" option. It
accepts any value of 1MB (1024 * 1024) bytes or more.
Be aware that fdcache.cpp (and fdcache.h) changed substantially as a result.
2) About local cache
Local cache files which are in directory specified by "use_cache" option do
not have always all of object data.
This cause is that s3fs uses ftruncate function and reads(writes) each block
unit of a temporary file.
s3fs manages each block unit's status which are "downloaded area" or "not".
For this status, s3fs makes new temporary file in cache directory which is
specified by "use_cache" option. This status files is in a directory which is
named "<use_cache sirectory>/.<bucket_name>/".
When s3fs opens this status file, s3fs locks this file for exclusive control by
calling flock function. You need to take care about this, the status files can
not be laid on network drive(like NFS).
This revision changes about file open mode, s3fs always opens a local cache
file and each status file with writable mode.
Last, this revision adds new option "del_cache", this option means that s3fs
deletes all local cache file when s3fs starts and exits.
3) Uploading
When s3fs writes data to file descriptor through FUSE request, old s3fs
revision downloads all of the object. But new revision does not download all,
it downloads only small percial area(some block units) including writing data
area.
And when s3fs closes or flushes the file descriptor, s3fs downloads other area
which is not downloaded from server. After that, s3fs uploads all of data.
Already r456 revision has parallel upload function, then this revision with
r456 and r457 are very big change for performance.
4) Downloading
By changing a temporary file and a local cache file, when s3fs downloads a
object, it downloads only the required range(some block units).
And s3fs downloads units by parallel GET request, it is same as a case of
uploading. (Maximum parallel request count and each download size are
specified same parameters for uploading.)
In the new revision, when s3fs opens file, s3fs returns file descriptor soon.
Because s3fs only opens(makes) the file descriptor with no downloading
data. And when s3fs reads a data, s3fs downloads only some block unit
including specified area.
This result is good for performance.
5) Changes option name
The option "parallel_upload" which added at r456 is changed to new option
name as "parallel_count". This reason is this option value is not only used by
uploading object, but a uploading object also uses this option. (For a while,
you can use old option name "parallel_upload" for compatibility.)
git-svn-id: http://s3fs.googlecode.com/svn/trunk@458 df820570-a93a-0410-bd06-b72b767a4274
2013-07-23 16:01:48 +00:00
2020-08-22 12:40:53 +00:00
S3FS_PRN_INFO3 ( " copying... [path=%s] " , tpath ) ;
2013-07-23 16:01:48 +00:00
2020-08-22 12:40:53 +00:00
int result = RequestPerform ( ) ;
2021-04-25 04:18:11 +00:00
result = MapPutErrorResponse ( result ) ;
2020-08-22 12:40:53 +00:00
bodydata . Clear ( ) ;
2015-10-06 14:46:14 +00:00
2013-07-23 16:01:48 +00:00
return result ;
2020-08-22 12:40:53 +00:00
}
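The block-unit download described in the "Changes codes for performance(part 3)" notes can be sketched as follows. This is only an illustration under stated assumptions, not the code in fdcache.cpp: `BlockRange` and `covering_blocks` are hypothetical names, and the sketch assumes that a read is widened to whole pages of `page_size` bytes and clipped at the object size.

```cpp
#include <algorithm>

// Hypothetical type (not part of s3fs): a byte range aligned to page boundaries.
struct BlockRange
{
    long long start;   // first byte to download
    long long size;    // number of bytes to download (0 means nothing to do)
};

// Hypothetical helper: widen the requested area [offset, offset + length) to
// whole pages of page_size bytes, then clip the result to file_size.
BlockRange covering_blocks(long long offset, long long length, long long page_size, long long file_size)
{
    BlockRange range = {0, 0};
    if(offset < 0 || length <= 0 || page_size <= 0 || file_size <= offset){
        return range;
    }
    // round the start down to a page boundary
    long long first = (offset / page_size) * page_size;
    // round the end up to a page boundary, but never past the end of the object
    long long end   = std::min(offset + length, file_size);
    long long last  = ((end + page_size - 1) / page_size) * page_size;
    last            = std::min(last, file_size);

    range.start = first;
    range.size  = last - first;
    return range;
}
```

With the default values from the notes (5 parallel requests of 10MB each, so a 50MB page), reading a single byte at offset 60MB of a 120MB object would request the whole second page, bytes 50MB to 100MB.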
2013-07-05 02:28:31 +00:00
2020-08-22 12:40:53 +00:00
int S3fsCurl : : PutRequest ( const char * tpath , headers_t & meta , int fd )
{
struct stat st ;
FILE * file = NULL ;
2013-07-23 16:01:48 +00:00
2020-08-22 12:40:53 +00:00
S3FS_PRN_INFO3 ( " [tpath=%s] " , SAFESTRPTR ( tpath ) ) ;
2013-07-23 16:01:48 +00:00
2020-08-22 12:40:53 +00:00
if ( ! tpath ) {
2021-01-18 09:50:49 +00:00
return - EINVAL ;
2020-08-22 12:40:53 +00:00
}
if ( - 1 ! = fd ) {
// duplicate fd
int fd2 ;
if ( - 1 = = ( fd2 = dup ( fd ) ) | | - 1 = = fstat ( fd2 , & st ) | | 0 ! = lseek ( fd2 , 0 , SEEK_SET ) | | NULL = = ( file = fdopen ( fd2 , " rb " ) ) ) {
int saved_errno = errno ; // keep errno before logging and close() can overwrite it
S3FS_PRN_ERR ( " Could not duplicate file descriptor(errno=%d) " , saved_errno ) ;
if ( - 1 ! = fd2 ) {
close ( fd2 ) ;
}
return - saved_errno ;
}
b_infile = file ;
} else {
// This case creates a zero-byte object (called by create_file_object()).
S3FS_PRN_INFO3 ( " create zero byte file object. " ) ;
}
2013-07-05 02:28:31 +00:00
2020-08-22 12:40:53 +00:00
if ( ! CreateCurlHandle ( ) ) {
if ( file ) {
fclose ( file ) ;
}
2021-01-18 09:50:49 +00:00
return - EIO ;
2020-08-22 12:40:53 +00:00
}
2020-09-11 09:37:24 +00:00
std : : string resource ;
std : : string turl ;
2020-08-22 12:40:53 +00:00
MakeUrlResource ( get_realpath ( tpath ) . c_str ( ) , resource , turl ) ;
2013-07-05 02:28:31 +00:00
2020-08-22 12:40:53 +00:00
url = prepare_url ( turl . c_str ( ) ) ;
path = get_realpath ( tpath ) ;
requestHeaders = NULL ;
responseHeaders . clear ( ) ;
bodydata . Clear ( ) ;
2013-07-05 02:28:31 +00:00
2020-08-22 12:40:53 +00:00
// Make request headers
2020-09-11 09:37:24 +00:00
std : : string strMD5 ;
2021-06-18 19:29:26 +00:00
if ( S3fsCurl : : is_content_md5 ) {
if ( - 1 ! = fd ) {
strMD5 = s3fs_get_content_md5 ( fd ) ;
if ( 0 = = strMD5 . length ( ) ) {
S3FS_PRN_ERR ( " Failed to make MD5. " ) ;
if ( file ) {
fclose ( file ) ; // do not leak the duplicated stream on this error path
}
return - EIO ;
}
} else {
2021-08-02 15:10:27 +00:00
strMD5 = EMPTY_MD5_BASE64_HASH ;
2021-01-25 10:08:14 +00:00
}
2020-08-22 12:40:53 +00:00
requestHeaders = curl_slist_sort_insert ( requestHeaders , " Content-MD5 " , strMD5 . c_str ( ) ) ;
}
2020-09-11 09:37:24 +00:00
std : : string contype = S3fsCurl : : LookupMimeType ( std : : string ( tpath ) ) ;
2020-08-22 12:40:53 +00:00
requestHeaders = curl_slist_sort_insert ( requestHeaders , " Content-Type " , contype . c_str ( ) ) ;
for ( headers_t : : iterator iter = meta . begin ( ) ; iter ! = meta . end ( ) ; + + iter ) {
2020-09-11 09:37:24 +00:00
std : : string key = lower ( iter - > first ) ;
std : : string value = iter - > second ;
2020-09-26 05:09:20 +00:00
if ( is_prefix ( key . c_str ( ) , " x-amz-acl " ) ) {
2020-08-22 12:40:53 +00:00
// do not set the value here; it is set later.
2020-09-26 05:09:20 +00:00
} else if ( is_prefix ( key . c_str ( ) , " x-amz-meta " ) ) {
2020-08-22 12:40:53 +00:00
requestHeaders = curl_slist_sort_insert ( requestHeaders , iter - > first . c_str ( ) , value . c_str ( ) ) ;
} else if ( key = = " x-amz-server-side-encryption " & & value ! = " aws:kms " ) {
// skip this header; it is set by the SSE logic below.
} else if ( key = = " x-amz-server-side-encryption-aws-kms-key-id " ) {
// skip this header; it is set by the SSE logic below.
} else if ( key = = " x-amz-server-side-encryption-customer-key-md5 " ) {
// skip this header; it is set by the SSE logic below.
2015-11-01 13:54:47 +00:00
}
2020-08-22 12:40:53 +00:00
}
// "x-amz-acl", storage class, sse
if ( S3fsCurl : : default_acl ! = acl_t : : PRIVATE ) {
requestHeaders = curl_slist_sort_insert ( requestHeaders , " x-amz-acl " , S3fsCurl : : default_acl . str ( ) ) ;
}
2021-05-21 14:34:31 +00:00
if ( strcasecmp ( GetStorageClass ( ) . c_str ( ) , " STANDARD " ) ! = 0 ) {
requestHeaders = curl_slist_sort_insert ( requestHeaders , " x-amz-storage-class " , GetStorageClass ( ) . c_str ( ) ) ;
2020-08-22 12:40:53 +00:00
}
// SSE
2020-09-11 09:37:24 +00:00
std : : string ssevalue ;
2020-09-26 04:19:52 +00:00
// do not add SSE for create bucket
if ( 0 ! = strcmp ( tpath , " / " ) ) {
if ( ! AddSseRequestHead ( S3fsCurl : : GetSseType ( ) , ssevalue , false , false ) ) {
S3FS_PRN_WARN ( " Failed to set SSE header, but continue... " ) ;
}
2020-08-22 12:40:53 +00:00
}
if ( is_use_ahbe ) {
// set additional header by ahbe conf
requestHeaders = AdditionalHeader : : get ( ) - > AddHeader ( requestHeaders , tpath ) ;
}
2013-07-05 02:28:31 +00:00
2020-08-22 12:40:53 +00:00
op = " PUT " ;
type = REQTYPE_PUT ;
2013-07-05 02:28:31 +00:00
2020-08-22 12:40:53 +00:00
// setopt
2021-09-01 23:07:06 +00:00
if ( CURLE_OK ! = curl_easy_setopt ( hCurl , CURLOPT_URL , url . c_str ( ) ) ) {
return - EIO ;
}
if ( CURLE_OK ! = curl_easy_setopt ( hCurl , CURLOPT_UPLOAD , 1L ) ) { // HTTP PUT (libcurl expects a long here)
return - EIO ;
}
if ( CURLE_OK ! = curl_easy_setopt ( hCurl , CURLOPT_WRITEDATA , ( void * ) & bodydata ) ) {
return - EIO ;
}
if ( CURLE_OK ! = curl_easy_setopt ( hCurl , CURLOPT_WRITEFUNCTION , WriteMemoryCallback ) ) {
return - EIO ;
}
2020-08-22 12:40:53 +00:00
if ( file ) {
2021-09-01 23:07:06 +00:00
if ( CURLE_OK ! = curl_easy_setopt ( hCurl , CURLOPT_INFILESIZE_LARGE , static_cast < curl_off_t > ( st . st_size ) ) ) { // Content-Length
return - EIO ;
}
if ( CURLE_OK ! = curl_easy_setopt ( hCurl , CURLOPT_INFILE , file ) ) {
return - EIO ;
}
2020-08-22 12:40:53 +00:00
} else {
2021-09-01 23:07:06 +00:00
if ( CURLE_OK ! = curl_easy_setopt ( hCurl , CURLOPT_INFILESIZE , 0L ) ) { // Content-Length: 0 (libcurl expects a long here)
return - EIO ;
}
}
if ( ! S3fsCurl : : AddUserAgent ( hCurl ) ) { // put User-Agent
return - EIO ;
2020-08-22 12:40:53 +00:00
}
2013-07-05 02:28:31 +00:00
2020-08-22 12:40:53 +00:00
S3FS_PRN_INFO3 ( " uploading... [path=%s][fd=%d][size=%lld] " , tpath , fd , static_cast < long long int > ( - 1 ! = fd ? st . st_size : 0 ) ) ;
2013-07-05 02:28:31 +00:00
2020-08-22 12:40:53 +00:00
int result = RequestPerform ( ) ;
2021-04-25 04:18:11 +00:00
result = MapPutErrorResponse ( result ) ;
2020-08-22 12:40:53 +00:00
bodydata . Clear ( ) ;
if ( file ) {
fclose ( file ) ;
}
return result ;
2013-07-05 02:28:31 +00:00
}
2020-09-14 08:47:21 +00:00
int S3fsCurl : : PreGetObjectRequest ( const char * tpath , int fd , off_t start , off_t size , sse_type_t ssetype , const std : : string & ssevalue )
2013-11-11 13:45:35 +00:00
{
2020-08-22 12:40:53 +00:00
S3FS_PRN_INFO3 ( " [tpath=%s][start=%lld][size=%lld] " , SAFESTRPTR ( tpath ) , static_cast < long long > ( start ) , static_cast < long long > ( size ) ) ;
2013-11-11 13:45:35 +00:00
2020-08-22 12:40:53 +00:00
if ( ! tpath | | - 1 = = fd | | 0 > start | | 0 > size ) {
2021-01-18 09:50:49 +00:00
return - EINVAL ;
2020-08-22 12:40:53 +00:00
}
2013-07-12 00:33:36 +00:00
2020-09-11 09:37:24 +00:00
std : : string resource ;
std : : string turl ;
2020-08-22 12:40:53 +00:00
MakeUrlResource ( get_realpath ( tpath ) . c_str ( ) , resource , turl ) ;
url = prepare_url ( turl . c_str ( ) ) ;
path = get_realpath ( tpath ) ;
requestHeaders = NULL ;
responseHeaders . clear ( ) ;
2021-04-12 22:36:09 +00:00
if ( 0 < size ) {
2020-09-11 09:37:24 +00:00
std : : string range = " bytes= " ;
2020-08-22 12:40:53 +00:00
range + = str ( start ) ;
range + = " - " ;
range + = str ( start + size - 1 ) ;
requestHeaders = curl_slist_sort_insert ( requestHeaders , " Range " , range . c_str ( ) ) ;
}
// SSE
if ( ! AddSseRequestHead ( ssetype , ssevalue , true , false ) ) {
S3FS_PRN_WARN ( " Failed to set SSE header, but continue... " ) ;
2017-05-07 09:29:08 +00:00
}
2020-08-22 12:40:53 +00:00
op = " GET " ;
type = REQTYPE_GET ;
2015-01-20 16:31:36 +00:00
2020-08-22 12:40:53 +00:00
// set lazy function
fpLazySetup = PreGetObjectRequestSetCurlOpts ;
2013-07-05 02:28:31 +00:00
2020-08-22 12:40:53 +00:00
// set info for callback func.
// (only fd, startpos and size are used; the other members are not used.)
partdata . clear ( ) ;
partdata . fd = fd ;
partdata . startpos = start ;
partdata . size = size ;
b_partdata_startpos = start ;
b_partdata_size = size ;
b_ssetype = ssetype ;
b_ssevalue = ssevalue ;
b_ssekey_pos = - 1 ; // this value is not used for get object.
2013-07-05 02:28:31 +00:00
2020-08-22 12:40:53 +00:00
return 0 ;
2013-07-10 06:24:06 +00:00
}
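The "Range" header built in PreGetObjectRequest() uses an inclusive last-byte position, which is why the code subtracts 1 from `start + size`. A small standalone sketch of that calculation follows. `make_range_value` is a hypothetical helper, not an s3fs function, and it uses `long long` instead of `off_t` to stay self-contained.

```cpp
#include <string>

// Hypothetical helper: build the value of an HTTP "Range" header for `size`
// bytes starting at offset `start`. An empty string means that no Range
// header should be sent (for example when size is 0 and the whole object is
// requested).
std::string make_range_value(long long start, long long size)
{
    if(start < 0 || size <= 0){
        return std::string();
    }
    // HTTP byte ranges are inclusive, so the last byte is start + size - 1.
    return "bytes=" + std::to_string(start) + "-" + std::to_string(start + size - 1);
}
```

For example, `make_range_value(0, 10)` gives `bytes=0-9`, which asks for exactly ten bytes.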
2020-09-14 09:09:25 +00:00
int S3fsCurl : : GetObjectRequest ( const char * tpath , int fd , off_t start , off_t size )
2013-07-10 06:24:06 +00:00
{
2020-08-22 12:40:53 +00:00
int result ;
2013-07-10 06:24:06 +00:00
2020-08-22 12:40:53 +00:00
S3FS_PRN_INFO3 ( " [tpath=%s][start=%lld][size=%lld] " , SAFESTRPTR ( tpath ) , static_cast < long long > ( start ) , static_cast < long long > ( size ) ) ;
2013-07-10 06:24:06 +00:00
2020-08-22 12:40:53 +00:00
if ( ! tpath ) {
2021-01-18 09:50:49 +00:00
return - EINVAL ;
2020-08-22 12:40:53 +00:00
}
sse_type_t ssetype = sse_type_t : : SSE_DISABLE ;
2020-09-11 09:37:24 +00:00
std : : string ssevalue ;
2020-08-22 12:40:53 +00:00
if ( ! get_object_sse_type ( tpath , ssetype , ssevalue ) ) {
S3FS_PRN_WARN ( " Failed to get SSE type for file(%s). " , SAFESTRPTR ( tpath ) ) ;
}
2013-07-10 06:24:06 +00:00
2020-08-22 12:40:53 +00:00
if ( 0 ! = ( result = PreGetObjectRequest ( tpath , fd , start , size , ssetype , ssevalue ) ) ) {
return result ;
}
if ( ! fpLazySetup | | ! fpLazySetup ( this ) ) {
S3FS_PRN_ERR ( " Failed to lazy setup in single get object request. " ) ;
2021-01-18 09:50:49 +00:00
return - EIO ;
2020-08-22 12:40:53 +00:00
}
2019-07-14 12:09:37 +00:00
2020-08-22 12:40:53 +00:00
S3FS_PRN_INFO3 ( " downloading... [path=%s][fd=%d] " , tpath , fd ) ;
2013-11-11 13:45:35 +00:00
2020-08-22 12:40:53 +00:00
result = RequestPerform ( ) ;
partdata . clear ( ) ;
2013-07-05 02:28:31 +00:00
2020-08-22 12:40:53 +00:00
return result ;
2013-07-05 02:28:31 +00:00
}
2020-08-22 12:40:53 +00:00
int S3fsCurl : : CheckBucket ( )
2013-07-05 02:28:31 +00:00
{
2020-08-22 12:40:53 +00:00
S3FS_PRN_INFO3 ( " check a bucket. " ) ;
2013-07-05 02:28:31 +00:00
2020-08-22 12:40:53 +00:00
if ( ! CreateCurlHandle ( ) ) {
2021-01-18 09:50:49 +00:00
return - EIO ;
2017-05-07 09:29:08 +00:00
}
2021-02-23 00:45:13 +00:00
std : : string urlargs ;
if ( S3fsCurl : : IsListObjectsV2 ( ) ) {
query_string = " list-type=2 " ;
urlargs = " ? " + query_string ;
}
2020-09-11 09:37:24 +00:00
std : : string resource ;
std : : string turl ;
2021-07-25 03:29:00 +00:00
MakeUrlResource ( " / " , resource , turl ) ;
2017-04-02 07:27:43 +00:00
2021-02-23 00:45:13 +00:00
turl + = urlargs ;
2020-08-22 12:40:53 +00:00
url = prepare_url ( turl . c_str ( ) ) ;
2021-07-25 03:29:00 +00:00
path = " / " ; // Only check the presence of the bucket, not the entire virtual path.
2020-08-22 12:40:53 +00:00
requestHeaders = NULL ;
responseHeaders . clear ( ) ;
bodydata . Clear ( ) ;
2017-04-02 07:27:43 +00:00
2020-08-22 12:40:53 +00:00
op = " GET " ;
type = REQTYPE_CHKBUCKET ;
// setopt
2021-09-01 23:07:06 +00:00
if ( CURLE_OK ! = curl_easy_setopt ( hCurl , CURLOPT_URL , url . c_str ( ) ) ) {
return - EIO ;
}
if ( CURLE_OK ! = curl_easy_setopt ( hCurl , CURLOPT_WRITEDATA , ( void * ) & bodydata ) ) {
return - EIO ;
}
if ( CURLE_OK ! = curl_easy_setopt ( hCurl , CURLOPT_WRITEFUNCTION , WriteMemoryCallback ) ) {
return - EIO ;
}
if ( CURLE_OK ! = curl_easy_setopt ( hCurl , CURLOPT_UNRESTRICTED_AUTH , 1L ) ) {
return - EIO ;
}
if ( ! S3fsCurl : : AddUserAgent ( hCurl ) ) { // put User-Agent
return - EIO ;
}
2019-01-29 07:39:11 +00:00
2020-08-22 12:40:53 +00:00
int result = RequestPerform ( ) ;
if ( result ! = 0 ) {
S3FS_PRN_ERR ( " Check bucket failed, S3 response: %s " , bodydata . str ( ) ) ;
}
return result ;
2019-01-29 07:39:11 +00:00
}

int S3fsCurl::ListBucketRequest(const char* tpath, const char* query)
{
    S3FS_PRN_INFO3("[tpath=%s]", SAFESTRPTR(tpath));

    if(!tpath){
        return -EINVAL;
    }
    if(!CreateCurlHandle()){
        return -EIO;
    }
    std::string resource;
    std::string turl;
    MakeUrlResource("", resource, turl);    // NOTICE: path is "".
    if(query){
        turl += "?";
        turl += query;
        query_string = query;
    }

    url            = prepare_url(turl.c_str());
    path           = get_realpath(tpath);
    requestHeaders = NULL;
    responseHeaders.clear();
    bodydata.Clear();

    op   = "GET";
    type = REQTYPE_LISTBUCKET;

    // setopt
    if(CURLE_OK != curl_easy_setopt(hCurl, CURLOPT_URL, url.c_str())){
        return -EIO;
    }
    if(CURLE_OK != curl_easy_setopt(hCurl, CURLOPT_WRITEDATA, (void*)&bodydata)){
        return -EIO;
    }
    if(CURLE_OK != curl_easy_setopt(hCurl, CURLOPT_WRITEFUNCTION, WriteMemoryCallback)){
        return -EIO;
    }
    if(S3fsCurl::is_verbose){
        if(CURLE_OK != curl_easy_setopt(hCurl, CURLOPT_DEBUGFUNCTION, S3fsCurl::CurlDebugBodyInFunc)){     // replace debug function
            return -EIO;
        }
    }
    if(!S3fsCurl::AddUserAgent(hCurl)){                            // put User-Agent
        return -EIO;
    }
    return RequestPerform();
}

//
// Initialize multipart upload
//
// Example:
//   POST /example-object?uploads HTTP/1.1
//   Host: example-bucket.s3.amazonaws.com
//   Date: Mon, 1 Nov 2010 20:34:56 GMT
//   Authorization: AWS VGhpcyBtZXNzYWdlIHNpZ25lZCBieSBlbHZpbmc=
//
int S3fsCurl::PreMultipartPostRequest(const char* tpath, headers_t& meta, std::string& upload_id, bool is_copy)
{
    S3FS_PRN_INFO3("[tpath=%s]", SAFESTRPTR(tpath));

    if(!tpath){
        return -EINVAL;
    }
    if(!CreateCurlHandle()){
        return -EIO;
    }
    std::string resource;
    std::string turl;
    MakeUrlResource(get_realpath(tpath).c_str(), resource, turl);

    query_string   = "uploads";
    turl          += "?" + query_string;
    url            = prepare_url(turl.c_str());
    path           = get_realpath(tpath);
    requestHeaders = NULL;
    bodydata.Clear();
    responseHeaders.clear();

    std::string contype = S3fsCurl::LookupMimeType(std::string(tpath));

    for(headers_t::iterator iter = meta.begin(); iter != meta.end(); ++iter){
        std::string key   = lower(iter->first);
        std::string value = iter->second;
        if(is_prefix(key.c_str(), "x-amz-acl")){
            // Do not copy the value here; "x-amz-acl" is set from default_acl below.
        }else if(is_prefix(key.c_str(), "x-amz-meta")){
            requestHeaders = curl_slist_sort_insert(requestHeaders, iter->first.c_str(), value.c_str());
        }else if(key == "x-amz-server-side-encryption" && value != "aws:kms"){
            // Only copy mode.
            if(is_copy && !AddSseRequestHead(sse_type_t::SSE_S3, value, false, true)){
                S3FS_PRN_WARN("Failed to insert SSE-S3 header.");
            }
        }else if(key == "x-amz-server-side-encryption-aws-kms-key-id"){
            // Only copy mode.
            if(is_copy && !value.empty() && !AddSseRequestHead(sse_type_t::SSE_KMS, value, false, true)){
                S3FS_PRN_WARN("Failed to insert SSE-KMS header.");
            }
        }else if(key == "x-amz-server-side-encryption-customer-key-md5"){
            // Only copy mode.
            if(is_copy){
                if(!AddSseRequestHead(sse_type_t::SSE_C, value, true, true) || !AddSseRequestHead(sse_type_t::SSE_C, value, true, false)){
                    S3FS_PRN_WARN("Failed to insert SSE-C header.");
                }
            }
        }
    }

    // "x-amz-acl", storage class, sse
    if(S3fsCurl::default_acl != acl_t::PRIVATE){
        requestHeaders = curl_slist_sort_insert(requestHeaders, "x-amz-acl", S3fsCurl::default_acl.str());
    }
    if(strcasecmp(GetStorageClass().c_str(), "STANDARD") != 0){
        requestHeaders = curl_slist_sort_insert(requestHeaders, "x-amz-storage-class", GetStorageClass().c_str());
    }
    // SSE
    if(!is_copy){
        std::string ssevalue;
        if(!AddSseRequestHead(S3fsCurl::GetSseType(), ssevalue, false, false)){
            S3FS_PRN_WARN("Failed to set SSE header, but continue...");
        }
    }
    if(is_use_ahbe){
        // set additional header by ahbe conf
        requestHeaders = AdditionalHeader::get()->AddHeader(requestHeaders, tpath);
    }

    requestHeaders = curl_slist_sort_insert(requestHeaders, "Accept", NULL);
    requestHeaders = curl_slist_sort_insert(requestHeaders, "Content-Type", contype.c_str());

    op   = "POST";
    type = REQTYPE_PREMULTIPOST;

    // setopt
    // NOTE: libcurl reads these option values as long, so pass long literals.
    if(CURLE_OK != curl_easy_setopt(hCurl, CURLOPT_URL, url.c_str())){
        return -EIO;
    }
    if(CURLE_OK != curl_easy_setopt(hCurl, CURLOPT_POST, 1L)){                 // POST
        return -EIO;
    }
    if(CURLE_OK != curl_easy_setopt(hCurl, CURLOPT_WRITEDATA, (void*)&bodydata)){
        return -EIO;
    }
    if(CURLE_OK != curl_easy_setopt(hCurl, CURLOPT_WRITEFUNCTION, WriteMemoryCallback)){
        return -EIO;
    }
    if(CURLE_OK != curl_easy_setopt(hCurl, CURLOPT_POSTFIELDSIZE, 0L)){
        return -EIO;
    }
    if(CURLE_OK != curl_easy_setopt(hCurl, CURLOPT_INFILESIZE, 0L)){           // Content-Length
        return -EIO;
    }
    if(!S3fsCurl::AddUserAgent(hCurl)){                                        // put User-Agent
        return -EIO;
    }

    // request
    int result;
    if(0 != (result = RequestPerform())){
        bodydata.Clear();
        return result;
    }

    if(!simple_parse_xml(bodydata.str(), bodydata.size(), "UploadId", upload_id)){
        bodydata.Clear();
        return -EIO;
    }

    bodydata.Clear();
    return 0;
}

int S3fsCurl::CompleteMultipartPostRequest(const char* tpath, const std::string& upload_id, etaglist_t& parts)
{
    S3FS_PRN_INFO3("[tpath=%s][parts=%zu]", SAFESTRPTR(tpath), parts.size());

    if(!tpath){
        return -EINVAL;
    }

    // make contents
    std::string postContent;
    postContent += "<CompleteMultipartUpload>\n";
    for(etaglist_t::iterator it = parts.begin(); it != parts.end(); ++it){
        if(it->etag.empty()){
            S3FS_PRN_ERR("%d file part is not finished uploading.", it->part_num);
            return -EIO;
        }
        postContent += "<Part>\n";
        postContent += "<PartNumber>" + str(it->part_num) + "</PartNumber>\n";
        postContent += "<ETag>" + it->etag + "</ETag>\n";
        postContent += "</Part>\n";
    }
    postContent += "</CompleteMultipartUpload>\n";

    // set postdata
    postdata             = reinterpret_cast<const unsigned char*>(postContent.c_str());
    b_postdata           = postdata;
    postdata_remaining   = postContent.size(); // without null
    b_postdata_remaining = postdata_remaining;

    if(!CreateCurlHandle()){
        postdata   = NULL;
        b_postdata = NULL;
        return -EIO;
    }
    std::string resource;
    std::string turl;
    MakeUrlResource(get_realpath(tpath).c_str(), resource, turl);

    query_string   = "uploadId=" + upload_id;
    turl          += "?" + query_string;
    url            = prepare_url(turl.c_str());
    path           = get_realpath(tpath);
    requestHeaders = NULL;
    bodydata.Clear();
    responseHeaders.clear();
    std::string contype = "application/xml";

    requestHeaders = curl_slist_sort_insert(requestHeaders, "Accept", NULL);
    requestHeaders = curl_slist_sort_insert(requestHeaders, "Content-Type", contype.c_str());

    op   = "POST";
    type = REQTYPE_COMPLETEMULTIPOST;

    // setopt
    if(CURLE_OK != curl_easy_setopt(hCurl, CURLOPT_URL, url.c_str())){
        return -EIO;
    }
    if(CURLE_OK != curl_easy_setopt(hCurl, CURLOPT_POST, 1L)){                 // POST
        return -EIO;
    }
    if(CURLE_OK != curl_easy_setopt(hCurl, CURLOPT_WRITEDATA, (void*)&bodydata)){
        return -EIO;
    }
    if(CURLE_OK != curl_easy_setopt(hCurl, CURLOPT_WRITEFUNCTION, WriteMemoryCallback)){
        return -EIO;
    }
    // The size is passed as curl_off_t, so use the *_LARGE variant of the option.
    if(CURLE_OK != curl_easy_setopt(hCurl, CURLOPT_POSTFIELDSIZE_LARGE, static_cast<curl_off_t>(postdata_remaining))){
        return -EIO;
    }
    if(CURLE_OK != curl_easy_setopt(hCurl, CURLOPT_READDATA, (void*)this)){
        return -EIO;
    }
    if(CURLE_OK != curl_easy_setopt(hCurl, CURLOPT_READFUNCTION, S3fsCurl::ReadCallback)){
        return -EIO;
    }
    if(S3fsCurl::is_verbose){
        if(CURLE_OK != curl_easy_setopt(hCurl, CURLOPT_DEBUGFUNCTION, S3fsCurl::CurlDebugBodyOutFunc)){     // replace debug function
            return -EIO;
        }
    }
    if(!S3fsCurl::AddUserAgent(hCurl)){                                        // put User-Agent
        return -EIO;
    }

    // request
    int result = RequestPerform();
    bodydata.Clear();
    postdata   = NULL;
    b_postdata = NULL;

    return result;
}

int S3fsCurl::MultipartListRequest(std::string& body)
{
    S3FS_PRN_INFO3("list request(multipart)");

    if(!CreateCurlHandle()){
        return -EIO;
    }
    std::string resource;
    std::string turl;
    path = get_realpath("/");
    MakeUrlResource(path.c_str(), resource, turl);

    query_string   = "uploads";
    turl          += "?" + query_string;
    url            = prepare_url(turl.c_str());
    requestHeaders = NULL;
    responseHeaders.clear();
    bodydata.Clear();

    requestHeaders = curl_slist_sort_insert(requestHeaders, "Accept", NULL);

    op   = "GET";
    type = REQTYPE_MULTILIST;

    // setopt
    if(CURLE_OK != curl_easy_setopt(hCurl, CURLOPT_URL, url.c_str())){
        return -EIO;
    }
    if(CURLE_OK != curl_easy_setopt(hCurl, CURLOPT_WRITEDATA, (void*)&bodydata)){
        return -EIO;
    }
    if(CURLE_OK != curl_easy_setopt(hCurl, CURLOPT_WRITEFUNCTION, WriteMemoryCallback)){
        return -EIO;
    }
    if(!S3fsCurl::AddUserAgent(hCurl)){                            // put User-Agent
        return -EIO;
    }

    int result;
    if(0 == (result = RequestPerform()) && 0 < bodydata.size()){
        body = bodydata.str();
    }else{
        body = "";
    }
    bodydata.Clear();

    return result;
}

int S3fsCurl::AbortMultipartUpload(const char* tpath, const std::string& upload_id)
{
    S3FS_PRN_INFO3("[tpath=%s]", SAFESTRPTR(tpath));

    if(!tpath){
        return -EINVAL;
    }
    if(!CreateCurlHandle()){
        return -EIO;
    }
    std::string resource;
    std::string turl;
    MakeUrlResource(get_realpath(tpath).c_str(), resource, turl);

    query_string   = "uploadId=" + upload_id;
    turl          += "?" + query_string;
    url            = prepare_url(turl.c_str());
    path           = get_realpath(tpath);
    requestHeaders = NULL;
    responseHeaders.clear();

    op   = "DELETE";
    type = REQTYPE_ABORTMULTIUPLOAD;

    if(CURLE_OK != curl_easy_setopt(hCurl, CURLOPT_URL, url.c_str())){
        return -EIO;
    }
    if(CURLE_OK != curl_easy_setopt(hCurl, CURLOPT_CUSTOMREQUEST, "DELETE")){
        return -EIO;
    }
    if(!S3fsCurl::AddUserAgent(hCurl)){                            // put User-Agent
        return -EIO;
    }

    return RequestPerform();
}

//
// PUT /ObjectName?partNumber=PartNumber&uploadId=UploadId HTTP/1.1
// Host: BucketName.s3.amazonaws.com
// Date: date
// Content-Length: Size
// Authorization: Signature
//
// PUT /my-movie.m2ts?partNumber=1&uploadId=VCVsb2FkIElEIGZvciBlbZZpbmcncyBteS1tb3ZpZS5tMnRzIHVwbG9hZR HTTP/1.1
// Host: example-bucket.s3.amazonaws.com
// Date: Mon, 1 Nov 2010 20:34:56 GMT
// Content-Length: 10485760
// Content-MD5: pUNXr/BjKK5G2UKvaRRrOA==
// Authorization: AWS VGhpcyBtZXNzYWdlIHNpZ25lZGGieSRlbHZpbmc=
//
int S3fsCurl::UploadMultipartPostSetup(const char* tpath, int part_num, const std::string& upload_id)
{
    S3FS_PRN_INFO3("[tpath=%s][start=%lld][size=%lld][part=%d]", SAFESTRPTR(tpath), static_cast<long long int>(partdata.startpos), static_cast<long long int>(partdata.size), part_num);

    if(-1 == partdata.fd || -1 == partdata.startpos || -1 == partdata.size){
        return -EINVAL;
    }
    requestHeaders = NULL;

    // make md5 and file pointer
    if(S3fsCurl::is_content_md5){
        unsigned char* md5raw = s3fs_md5_fd(partdata.fd, partdata.startpos, partdata.size);
        if(md5raw == NULL){
            S3FS_PRN_ERR("Could not make md5 for file(part %d)", part_num);
            return -EIO;
        }
        partdata.etag = s3fs_hex_lower(md5raw, get_md5_digest_length());
        char* md5base64p = s3fs_base64(md5raw, get_md5_digest_length());
        requestHeaders = curl_slist_sort_insert(requestHeaders, "Content-MD5", md5base64p);
        delete[] md5base64p;
        delete[] md5raw;
    }

    // make request
    query_string = "partNumber=" + str(part_num) + "&uploadId=" + upload_id;
    std::string urlargs = "?" + query_string;
    std::string resource;
    std::string turl;
    MakeUrlResource(get_realpath(tpath).c_str(), resource, turl);

    turl += urlargs;
    url   = prepare_url(turl.c_str());
    path  = get_realpath(tpath);
    bodydata.Clear();
    headdata.Clear();
    responseHeaders.clear();

    // SSE
    if(sse_type_t::SSE_C == S3fsCurl::GetSseType()){
        std::string ssevalue;
        if(!AddSseRequestHead(S3fsCurl::GetSseType(), ssevalue, false, false)){
            S3FS_PRN_WARN("Failed to set SSE header, but continue...");
        }
    }

    requestHeaders = curl_slist_sort_insert(requestHeaders, "Accept", NULL);

    op   = "PUT";
    type = REQTYPE_UPLOADMULTIPOST;

    // set lazy function
    fpLazySetup = UploadMultipartPostSetCurlOpts;

    return 0;
}

int S3fsCurl::UploadMultipartPostRequest(const char* tpath, int part_num, const std::string& upload_id)
{
    int result;

    S3FS_PRN_INFO3("[tpath=%s][start=%lld][size=%lld][part=%d]", SAFESTRPTR(tpath), static_cast<long long int>(partdata.startpos), static_cast<long long int>(partdata.size), part_num);

    // setup
    if(0 != (result = S3fsCurl::UploadMultipartPostSetup(tpath, part_num, upload_id))){
        return result;
    }

    if(!fpLazySetup || !fpLazySetup(this)){
        S3FS_PRN_ERR("Failed to lazy setup in multipart upload post request.");
        return -EIO;
    }

    // request
    if(0 == (result = RequestPerform())){
        // UploadMultipartPostComplete returns true on success; report a
        // failure as a negative errno like every other error path here.
        if(!UploadMultipartPostComplete()){
            result = -EIO;
        }
    }

    // closing
    bodydata.Clear();
    headdata.Clear();

    return result;
}

int S3fsCurl::CopyMultipartPostSetup(const char* from, const char* to, int part_num, const std::string& upload_id, headers_t& meta)
{
    S3FS_PRN_INFO3("[from=%s][to=%s][part=%d]", SAFESTRPTR(from), SAFESTRPTR(to), part_num);

    if(!from || !to){
        return -EINVAL;
    }
    query_string = "partNumber=" + str(part_num) + "&uploadId=" + upload_id;
    std::string urlargs = "?" + query_string;
    std::string resource;
    std::string turl;
    MakeUrlResource(get_realpath(to).c_str(), resource, turl);

    turl          += urlargs;
    url            = prepare_url(turl.c_str());
    path           = get_realpath(to);
    requestHeaders = NULL;
    responseHeaders.clear();
    bodydata.Clear();
    headdata.Clear();

    std::string contype = S3fsCurl::LookupMimeType(std::string(to));
    requestHeaders = curl_slist_sort_insert(requestHeaders, "Content-Type", contype.c_str());

    // Make request headers
    for(headers_t::iterator iter = meta.begin(); iter != meta.end(); ++iter){
        std::string key   = lower(iter->first);
        std::string value = iter->second;
        if(key == "x-amz-copy-source"){
            requestHeaders = curl_slist_sort_insert(requestHeaders, iter->first.c_str(), value.c_str());
        }else if(key == "x-amz-copy-source-range"){
            requestHeaders = curl_slist_sort_insert(requestHeaders, iter->first.c_str(), value.c_str());
        }
        // NOTICE: x-amz-acl and x-amz-server-side-encryption are not set!
    }

    op   = "PUT";
    type = REQTYPE_COPYMULTIPOST;

    // set lazy function
    fpLazySetup = CopyMultipartPostSetCurlOpts;

    // request
    S3FS_PRN_INFO3("copying... [from=%s][to=%s][part=%d]", from, to, part_num);

    return 0;
}

bool S3fsCurl::UploadMultipartPostComplete()
{
    headers_t::iterator it = responseHeaders.find("ETag");
    if(it == responseHeaders.end()){
        return false;
    }

    // check etag(md5);
    //
    // The ETAG when using SSE_C and SSE_KMS does not reflect the MD5 we sent
    // SSE_C: https://docs.aws.amazon.com/AmazonS3/latest/API/RESTObjectPUT.html
    // SSE_KMS is ignored in the above, but in the following it states the same in the highlights:
    // https://docs.aws.amazon.com/AmazonS3/latest/dev/UsingKMSEncryption.html
    //
    if(S3fsCurl::is_content_md5 && sse_type_t::SSE_C != S3fsCurl::GetSseType() && sse_type_t::SSE_KMS != S3fsCurl::GetSseType()){
        if(!etag_equals(it->second, partdata.etag)){
            return false;
        }
    }
    partdata.petag->etag = it->second;
    partdata.uploaded    = true;

    return true;
}

bool S3fsCurl::CopyMultipartPostCallback(S3fsCurl* s3fscurl)
{
    if(!s3fscurl){
        return false;
    }

    return s3fscurl->CopyMultipartPostComplete();
}

bool S3fsCurl::CopyMultipartPostComplete()
{
    std::string etag;
    partdata.uploaded = simple_parse_xml(bodydata.str(), bodydata.size(), "ETag", etag);
    // strip the surrounding double quotes from the ETag value
    if(etag.size() >= 2 && *etag.begin() == '"' && *etag.rbegin() == '"'){
        etag.erase(etag.size() - 1);
        etag.erase(0, 1);
    }
    partdata.petag->etag = etag;

    bodydata.Clear();
    headdata.Clear();

    return true;
}

bool S3fsCurl::MixMultipartPostComplete()
{
    bool result;
    if(-1 == partdata.fd){
        result = CopyMultipartPostComplete();
    }else{
        result = UploadMultipartPostComplete();
    }
    return result;
}

int S3fsCurl::MultipartHeadRequest(const char* tpath, off_t size, headers_t& meta, bool is_copy)
{
    int         result;
    std::string upload_id;
    off_t       chunk;
    off_t       bytes_remaining;
    etaglist_t  list;

    S3FS_PRN_INFO3("[tpath=%s]", SAFESTRPTR(tpath));

    if(0 != (result = PreMultipartPostRequest(tpath, meta, upload_id, is_copy))){
        return result;
    }
    DestroyCurlHandle();

    // Initialize S3fsMultiCurl
    S3fsMultiCurl curlmulti(GetMaxParallelCount());
    curlmulti.SetSuccessCallback(S3fsCurl::CopyMultipartPostCallback);
    curlmulti.SetRetryCallback(S3fsCurl::CopyMultipartPostRetryCallback);

    for(bytes_remaining = size, chunk = 0; 0 < bytes_remaining; bytes_remaining -= chunk){
        chunk = bytes_remaining > GetMultipartCopySize() ? GetMultipartCopySize() : bytes_remaining;

        // inclusive byte range of this part: [size - bytes_remaining, size - bytes_remaining + chunk - 1]
        std::ostringstream strrange;
        strrange << "bytes=" << (size - bytes_remaining) << "-" << (size - bytes_remaining + chunk - 1);
        meta["x-amz-copy-source-range"] = strrange.str();

        // s3fscurl sub object
        S3fsCurl* s3fscurl_para = new S3fsCurl(true);
        s3fscurl_para->b_from   = SAFESTRPTR(tpath);
        s3fscurl_para->b_meta   = meta;
        s3fscurl_para->partdata.add_etag_list(list);

        // initiate upload part for parallel
        if(0 != (result = s3fscurl_para->CopyMultipartPostSetup(tpath, tpath, s3fscurl_para->partdata.get_part_number(), upload_id, meta))){
            S3FS_PRN_ERR("failed uploading part setup(%d)", result);
            delete s3fscurl_para;
            return result;
        }

        // set into parallel object
        if(!curlmulti.SetS3fsCurlObject(s3fscurl_para)){
            S3FS_PRN_ERR("Could not make curl object into multi curl(%s).", tpath);
            delete s3fscurl_para;
            return -EIO;
        }
    }

    // Multi request
    if(0 != (result = curlmulti.Request())){
        S3FS_PRN_ERR("error occurred in multi request(errno=%d).", result);

        S3fsCurl s3fscurl_abort(true);
        int result2 = s3fscurl_abort.AbortMultipartUpload(tpath, upload_id);
        s3fscurl_abort.DestroyCurlHandle();
        if(result2 != 0){
            S3FS_PRN_ERR("error aborting multipart upload(errno=%d).", result2);
        }
        return result;
    }

    if(0 != (result = CompleteMultipartPostRequest(tpath, upload_id, list))){
        return result;
    }
    return 0;
}

int S3fsCurl::MultipartUploadRequest(const std::string& upload_id, const char* tpath, int fd, off_t offset, off_t size, etagpair* petagpair)
{
    S3FS_PRN_INFO3("[upload_id=%s][tpath=%s][fd=%d][offset=%lld][size=%lld]", upload_id.c_str(), SAFESTRPTR(tpath), fd, static_cast<long long int>(offset), static_cast<long long int>(size));

    // duplicate fd
    int fd2;
    if(-1 == (fd2 = dup(fd)) || 0 != lseek(fd2, 0, SEEK_SET)){
        // save errno first: logging and close() may overwrite it
        int saved_errno = errno;
        S3FS_PRN_ERR("Could not duplicate file descriptor(errno=%d)", saved_errno);
        if(-1 != fd2){
            close(fd2);
        }
        return -saved_errno;
    }

    // set
    partdata.fd         = fd2;
    partdata.startpos   = offset;
    partdata.size       = size;
    b_partdata_startpos = partdata.startpos;
    b_partdata_size     = partdata.size;
    partdata.set_etag(petagpair);

    // upload part
    int result;
    if(0 != (result = UploadMultipartPostRequest(tpath, petagpair->part_num, upload_id))){
        S3FS_PRN_ERR("failed uploading %d part by error(%d)", petagpair->part_num, result);
        close(fd2);
        return result;
    }
    DestroyCurlHandle();
    close(fd2);

    return 0;
}

int S3fsCurl::MultipartRenameRequest(const char* from, const char* to, headers_t& meta, off_t size)
{
    int         result;
    std::string upload_id;
    off_t       chunk;
    off_t       bytes_remaining;
    etaglist_t  list;

    S3FS_PRN_INFO3("[from=%s][to=%s]", SAFESTRPTR(from), SAFESTRPTR(to));

    std::string srcresource;
    std::string srcurl;
    MakeUrlResource(get_realpath(from).c_str(), srcresource, srcurl);

    meta["Content-Type"]      = S3fsCurl::LookupMimeType(std::string(to));
    meta["x-amz-copy-source"] = srcresource;

    if(0 != (result = PreMultipartPostRequest(to, meta, upload_id, true))){
        return result;
    }
    DestroyCurlHandle();

    // Initialize S3fsMultiCurl
    S3fsMultiCurl curlmulti(GetMaxParallelCount());
    curlmulti.SetSuccessCallback(S3fsCurl::CopyMultipartPostCallback);
    curlmulti.SetRetryCallback(S3fsCurl::CopyMultipartPostRetryCallback);

    for(bytes_remaining = size, chunk = 0; 0 < bytes_remaining; bytes_remaining -= chunk){
        chunk = bytes_remaining > GetMultipartCopySize() ? GetMultipartCopySize() : bytes_remaining;

        // inclusive byte range of this part: [size - bytes_remaining, size - bytes_remaining + chunk - 1]
        std::ostringstream strrange;
        strrange << "bytes=" << (size - bytes_remaining) << "-" << (size - bytes_remaining + chunk - 1);
        meta["x-amz-copy-source-range"] = strrange.str();

        // s3fscurl sub object
        S3fsCurl* s3fscurl_para = new S3fsCurl(true);
        s3fscurl_para->b_from   = SAFESTRPTR(from);
        s3fscurl_para->b_meta   = meta;
        s3fscurl_para->partdata.add_etag_list(list);

        // initiate upload part for parallel
        if(0 != (result = s3fscurl_para->CopyMultipartPostSetup(from, to, s3fscurl_para->partdata.get_part_number(), upload_id, meta))){
            S3FS_PRN_ERR("failed uploading part setup(%d)", result);
            delete s3fscurl_para;
            return result;
        }

        // set into parallel object
        if(!curlmulti.SetS3fsCurlObject(s3fscurl_para)){
            S3FS_PRN_ERR("Could not make curl object into multi curl(%s).", to);
            delete s3fscurl_para;
            return -EIO;
        }
    }

    // Multi request
    if(0 != (result = curlmulti.Request())){
        S3FS_PRN_ERR("error occurred in multi request(errno=%d).", result);

        S3fsCurl s3fscurl_abort(true);
        int result2 = s3fscurl_abort.AbortMultipartUpload(to, upload_id);
        s3fscurl_abort.DestroyCurlHandle();
        if(result2 != 0){
            S3FS_PRN_ERR("error aborting multipart upload(errno=%d).", result2);
        }
        return result;
    }

    if(0 != (result = CompleteMultipartPostRequest(to, upload_id, list))){
        return result;
    }
    return 0;
}

/*
* Local variables:
* tab-width: 4
* c-basic-offset: 4
* End:
* vim600: expandtab sw=4 ts=4 fdm=marker
* vim<600: expandtab sw=4 ts=4
*/